Data Containers

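Taichi Fields

A field in Taichi is a global data container that can be accessed from both the Python scope and the Taichi scope. A field must, however, be declared in the Python scope. A field can have at most 8 dimensions.

Scalar fields

A scalar field stores scalars and is the most basic kind of field.

A 0D scalar field is a single scalar.
A 1D scalar field is a one-dimensional array of scalars.
An ND scalar field is an N-dimensional array of scalars.

Declaration

Use ti.field(dtype, shape) to declare a scalar field. dtype is a Taichi primitive type, and shape is a tuple of integers (a single integer also works for a 1D field). The corresponding attributes can be read back through field.dtype and field.shape.

Whenever a field is declared, Taichi automatically initializes its elements to 0.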

f_0d = ti.field(ti.f32, shape=())  # 0D field
f_1d = ti.field(ti.i32, shape=9)  # A 1D field of length 9

f_0d[None] = 1.0  # Must use `None` to access a 0D field
f_1d[0] = 1  # Use an integer index to access an element of a 1D field

f_2d = ti.field(int, shape=(3, 6))  # A 2D field in the shape (3, 6)

Warning

Taichi fields do not support slicing. None of the following is allowed:

for x in f_2d[0]:  # Error! You tried to access its first row, but it is not supported
    ...


f_2d[0][3:] = [4, 5, 6]  # Error! You tried to access a slice of the first row, but it is not supported

Use field.fill() to fill a field with a given value.

x = ti.field(int, shape=(5, 5))
x.fill(1)  # Sets all elements in x to 1

@ti.kernel
def test():
    x.fill(-1)  # Sets all elements in x to -1

Vector fields

Declaration

Use ti.Vector.field(n, dtype, shape) to declare a field of n-dimensional vectors. dtype and shape have the same meaning as for a scalar field.

# Declares a 3x3 vector field comprising 2D vectors
f = ti.Vector.field(n=2, dtype=float, shape=(3, 3))


box_size = (300, 300, 300)  # A 300x300x300 grid in a 3D space

# Declares a 300x300x300 vector field, whose vector dimension is n=4
volumetric_field = ti.Vector.field(n=4, dtype=ti.f32, shape=box_size)

Elements of a 0D vector field are likewise accessed with None as the index.

x = ti.Vector.field(n=3, dtype=ti.f32, shape=()) # A 0D vector field

item1 = x[None][0]

Matrix fields

Declaration

Use ti.Matrix.field(n, m, dtype, shape) to declare a field of n x m matrices. dtype and shape have the same meaning as for a scalar field.

# Declares a 300x400x500 matrix field, each of its elements being a 3x2 matrix
tensor_field = ti.Matrix.field(n=3, m=2, dtype=ti.f32, shape=(300, 400, 500))

Elements of a 0D matrix field are likewise accessed with None as the index.

x = ti.Matrix.field(n=3, m=4, dtype=ti.f32, shape=()) # A 0D matrix field

item1 = x[None][0, 1]

Operations on a matrix are unrolled at compile time. Here is an example:

import taichi as ti
ti.init()

a = ti.Matrix.field(n=2, m=3, dtype=ti.f32, shape=(2, 2))
@ti.kernel
def test():
    for i in ti.grouped(a):
        # a[i] is a 2x3 matrix
        a[i] = [[1, 1, 1], [1, 1, 1]]
        # The assignment is unrolled to the following at `compile time`:
        # a[i][0, 0] = 1
        # a[i][0, 1] = 1
        # a[i][0, 2] = 1
        # a[i][1, 0] = 1
        # a[i][1, 1] = 1
        # a[i][1, 2] = 1

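Because of this unrolling, keep the element matrices small and put the large extent into the field shape instead. A large element matrix generates a lot of unrolled code and slows down compilation.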

Example

# Not recommended: each element is a large 64x32 matrix
m = ti.Matrix.field(64, 32, dtype=ti.f32, shape=(3, 2))

# Recommended: small 3x2 matrices stored in a larger field
m = ti.Matrix.field(3, 2, dtype=ti.f32, shape=(64, 32))
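To get a feel for why the element matrices should stay small, the following pure-Python sketch lists the scalar assignments that a single matrix assignment expands to. The helper `unrolled_assignments` is hypothetical and not part of Taichi; it only mimics the unrolling pattern shown in the comments of the kernel above.

```python
def unrolled_assignments(n, m, target="a[i]", value=1):
    """Return the scalar assignments that one n x m matrix assignment expands to.

    Hypothetical helper for illustration only: it mimics the unrolling
    pattern shown above and does not call into Taichi.
    """
    return [f"{target}[{r}, {c}] = {value}" for r in range(n) for c in range(m)]


small = unrolled_assignments(3, 2)    # element type of the recommended field
large = unrolled_assignments(64, 32)  # element type of the field to avoid

for line in small:
    print(line)
print(len(small), "statements for a 3x2 matrix")
print(len(large), "statements for a 64x32 matrix")
```

The number of generated statements grows with n * m, whereas the field shape only changes how many times the loop body runs.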

Struct fields

Declaration

Use ti.Struct.field(members, shape) to declare a struct field, where members is a dictionary and shape is a tuple.

A Taichi struct field supports two ways of accessing its elements: index-first and name-first.

Example

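import taichi as ti
ti.init()

vec3 = ti.math.vec3
# A 1D struct field; pos is a 3D vector and mass is a scalar.
# The field length 8 is an arbitrary choice for this illustration.
particle_field = ti.Struct.field({"pos": vec3, "mass": ti.f32}, shape=(8,))

# Index-first: sets the position of the first particle in the field to [0.0, 0.0, 0.0]
particle_field[0].pos = vec3(0)

# Name-first: sets the mass of the first particle in the field to 1.0
particle_field.mass[0] = 1.0

# Sets the mass of every particle in the struct field to 1.0
particle_field.mass.fill(1.0)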

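Organizing an efficient data layout

The core principle of an efficient data layout is locality. An efficient layout generally has at least one of the following properties:

Dense data structures
Loops over a small range of data
Sequential loads and stores

Taichi provides the flexible ti.root.X statements for describing more complex data organizations.

Declaring a 0D field: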

x = ti.field(ti.f32)
ti.root.place(x)

# is equivalent to:
x = ti.field(ti.f32, shape=())

Declaring a 1D field of shape 3:

x = ti.field(ti.f32)
ti.root.dense(ti.i, 3).place(x) # `ti.i`

# is equivalent to:
x = ti.field(ti.f32, shape=3)

Declaring a 2D field of shape (3, 4):

x = ti.field(ti.f32)
ti.root.dense(ti.ij, (3, 4)).place(x) # `ti.ij`
# is equivalent to:
x = ti.field(ti.f32, shape=(3, 4))

# nest use of dense is also available
x = ti.field(ti.f32)
ti.root.dense(ti.i, 3).dense(ti.j, 4).place(x)

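The 2D array built with nested dense statements is not exactly the same as the one built with ti.field. Both produce a 2D array of the same shape, but their SNodeTree hierarchies differ: with ti.field(ti.f32, shape=(3, 4)), x is placed under a single dense node that covers both axes, whereas in the nested form x is placed under a dense node along j that is itself nested inside a dense node along i.

Each ti.root.X statement binds the shape of a field to the corresponding axis step by step. By nesting several statements, we can build a field with more dimensions.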

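AoS and SoA

AoS stands for array of structures, and SoA stands for structure of arrays. Take an RGB image with 4 pixels and 3 color channels: an AoS layout stores it as RGBRGBRGBRGB, whereas an SoA layout stores it as RRRRGGGGBBBB. Whether to choose AoS or SoA depends largely on how the field is accessed.

Both AoS and SoA layouts can be built with ti.root.X statements.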

# SoA field
x = ti.field(ti.f32)
y = ti.field(ti.f32)
ti.root.dense(ti.i, M).place(x)
ti.root.dense(ti.i, M).place(y)

#  address: low ................................. high
#           x[0]  x[1]  x[2] ... y[0]  y[1]  y[2] ...

# AoS field
x = ti.field(ti.f32)
y = ti.field(ti.f32)
ti.root.dense(ti.i, M).place(x, y)

#  address: low .............................. high
#           x[0]  y[0]  x[1]  y[1]  x[2]  y[2] ...
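The two address diagrams above can be reproduced with a small pure-Python model. This is an idealized sketch: the functions `soa_order`, `aos_order` and `stride_bytes` are hypothetical helpers, and the real addresses in Taichi depend on the backend and on any padding it applies.

```python
def soa_order(names, length):
    """Labels in address order when every member is stored as its own contiguous array."""
    return [f"{name}[{i}]" for name in names for i in range(length)]


def aos_order(names, length):
    """Labels in address order when the members of each index are stored together."""
    return [f"{name}[{i}]" for i in range(length) for name in names]


def stride_bytes(order, name, elem_size=4):
    """Distance in bytes between element 0 and element 1 of one member (4 bytes per f32)."""
    return (order.index(f"{name}[1]") - order.index(f"{name}[0]")) * elem_size


M = 3  # illustrative length, standing in for the M used above
soa = soa_order(["x", "y"], M)
aos = aos_order(["x", "y"], M)

print("SoA:", " ".join(soa), "| stride of x:", stride_bytes(soa, "x"), "bytes")
print("AoS:", " ".join(aos), "| stride of x:", stride_bytes(aos, "x"), "bytes")
```

In this model a loop that reads only x walks through memory with a smaller stride under SoA, while a loop that reads x[i] and y[i] together finds them adjacent under AoS, which is why the access pattern drives the choice.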

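Managing memory usage

In general, Taichi allocates and frees memory out of the user's sight. Sometimes, however, we need to manage memory allocation manually.

For such cases Taichi provides FieldsBuilder, which supports manual allocation and destruction of field memory. FieldsBuilder has the same declaration API as ti.root, but finalize() must additionally be called after all declarations. finalize() returns an SNodeTree object that is used to destroy the memory later.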

Example

import taichi as ti
ti.init()

@ti.kernel
def func(v: ti.template()):
    for I in ti.grouped(v):
        v[I] += 1

fb1 = ti.FieldsBuilder()
x = ti.field(dtype=ti.f32)
fb1.dense(ti.ij, (5, 5)).place(x)
fb1_snode_tree = fb1.finalize()  # Finalizes the FieldsBuilder and returns a SNodeTree
func(x)
fb1_snode_tree.destroy()  # Destruction

fb2 = ti.FieldsBuilder()
y = ti.field(dtype=ti.f32)
fb2.dense(ti.i, 5).place(y)
fb2_snode_tree = fb2.finalize()  # Finalizes the FieldsBuilder and returns a SNodeTree
func(y)
fb2_snode_tree.destroy()  # Destruction

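Taichi Ndarray

A Taichi ndarray is very similar to a NumPy ndarray, but its underlying memory is allocated on the Taichi arch (backend) in use and is managed by the Taichi runtime.

A Taichi ndarray allocates a contiguous block of memory and allows direct data exchange with external libraries (NumPy ndarray / torch tensor). Compared with a Taichi field, it is better suited to dense data.

Use ti.ndarray to declare a Taichi ndarray. dtype can be a primitive type or a compound type such as a matrix or vector. Note that an ndarray can only be declared in the Python scope, and all of its elements are initialized to 0.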

arr = ti.ndarray(dtype=ti.math.vec3, shape=(4, 4))

Common ndarray operations

Filling an ndarray with a scalar value:

arr.fill(1.0)

Reading and writing ndarray elements from the Python scope:

# Returns a ti.Vector, which is a copy of the element
print(arr[0, 0]) # [1.0, 1.0, 1.0]

# Writes to an element
arr[0, 0] = [1.0, 2.0, 3.0] # arr[0, 0] is now [1.0, 2.0, 3.0]

# Writes to a scalar inside vector element
arr[0, 0][1] = 2.2  # arr[0, 0] is now [1.0, 2.2, 3.0]

Copying data between ndarrays:

import copy
# Copies from another ndarray with the same size
b = ti.ndarray(dtype=ti.math.vec3, shape=(4, 4))
b.copy_from(arr)  # Copies all data from arr to b

# Deep copy
c = copy.deepcopy(b)  # c is a new ndarray that has a copy of b's data.

# Shallow copy
d = copy.copy(b)  # d is a shallow copy of b; they share the underlying memory
d[0, 0][0] = 1.2  # This mutates b as well, so b[0, 0][0] is now 1.2

Exchanging data with NumPy ndarrays (see also the section "Interacting with external data"):

# to_numpy returns a NumPy array with the same shape as d and a copy of d's value
e = d.to_numpy()

# from_numpy copies the data in the NumPy array e to the Taichi ndarray d
e.fill(10.0)  # Fills in the NumPy array with value 10.0
d.from_numpy(e)  # Now d is filled in with 10.0

Passing a Taichi ndarray into a Taichi kernel (the argument is passed by reference):

@ti.kernel
def foo(A: ti.types.ndarray(dtype=ti.f32, ndim=2)):
    do_something(A)

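If dtype and ndim are not specified, Taichi infers both from the ndarray that is passed in. If they are specified, Taichi checks that the ndarray matches the declaration and raises an error if it does not.

External arrays can be passed into a Taichi kernel without any further type conversion.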

import numpy as np
import torch
import taichi as ti

ti.init(arch=ti.cuda)

@ti.kernel
def add_one(arr : ti.types.ndarray(dtype=ti.f32, ndim=2)):
    for i in ti.grouped(arr):
        arr[i] = arr[i] + 1.0

arr_np = np.ones((3, 3), dtype=np.float32)
add_one(arr_np) # arr_np is updated by taichi kernel

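# dtype is set explicitly so that the tensor matches the kernel's dtype=ti.f32 annotation
arr_torch = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float32, device='cuda:0')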
add_one(arr_torch) # arr_torch is updated by taichi kernel

Reusing compiled kernels through ndarray templates (when the type and size of the ndarray are not specified):

@ti.kernel
def test(arr: ti.types.ndarray()):
    for I in ti.grouped(arr):
        arr[I] += 2

a = ti.ndarray(dtype=ti.math.vec3, shape=(4, 4))
b = ti.ndarray(dtype=ti.math.vec3, shape=(5, 5))
c = ti.ndarray(dtype=ti.f32, shape=(4, 4))
d = ti.ndarray(dtype=ti.f32, shape=(8, 6))
e = ti.ndarray(dtype=ti.math.vec3, shape=(4, 4, 4))
test(a) # New kernel compilation
test(b) # Reuse kernel compiled for a
test(c) # New kernel compilation
test(d) # Reuse kernel compiled for c
test(e) # New kernel compilation

The same compilation rule applies to NumPy and PyTorch data.
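The compile/reuse pattern in the comments above is consistent with a cache keyed on the element type and the number of dimensions, but not on the shape. The following toy model reproduces that pattern in pure Python. It is only a sketch under that assumption: `call_kernel` and the `compiled` dictionary are hypothetical, and Taichi's real cache key may contain more than these two items.

```python
compiled = {}  # toy cache: (element dtype, ndim) -> id of the compiled kernel


def call_kernel(dtype, shape):
    """Return 'compile' or 'reuse' for an argument, keyed on dtype and ndim only."""
    key = (dtype, len(shape))
    if key in compiled:
        return "reuse"
    compiled[key] = len(compiled)
    return "compile"


calls = [
    ("vec3", (4, 4)),     # a
    ("vec3", (5, 5)),     # b: same dtype and ndim as a
    ("f32", (4, 4)),      # c: new dtype
    ("f32", (8, 6)),      # d: same dtype and ndim as c
    ("vec3", (4, 4, 4)),  # e: new ndim
]
results = [call_kernel(dtype, shape) for dtype, shape in calls]
print(results)
```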

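Spatially sparse data structures

In Taichi, SNodes can be composed into data structures similar to VDB and SPGrid. Taichi's spatially sparse data structures have the following advantages:

Access by index
Automatic parallelization when iterating
Automatic optimization of memory access

TODO: I do not need field sparsity for now; I will study it when the need arises.

Coordinate offsets

A coordinate offset can be specified when defining a Taichi field. It shifts the bounds of the field so that indexing no longer starts at 0.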

a = ti.Matrix.field(2, 2, dtype=ti.f32, shape=(32, 64), offset=(-16, 8))

# Valid indices are -16 <= i < 16 and 8 <= j < 72 (upper bounds are exclusive)
a[-16, 8]   # lower left corner
a[15, 8]    # lower right corner
a[-16, 71]  # upper left corner
a[15, 71]   # upper right corner

The offset argument must have the same number of dimensions as the shape of the Taichi field; otherwise an error is raised.
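The effect of an offset can be modeled as a simple index translation. The sketch below assumes that the valid range on each axis is offset <= index < offset + shape. The function `to_storage_index` is a hypothetical helper, not a Taichi API; Taichi performs the equivalent translation internally.

```python
def to_storage_index(index, shape, offset):
    """Map a logical index of an offset field to a zero-based storage index.

    Hypothetical helper for illustration only. Raises ValueError when the
    dimensions disagree and IndexError when the index is outside the field.
    """
    if not (len(index) == len(shape) == len(offset)):
        raise ValueError("index, shape and offset must have the same number of dimensions")
    storage = tuple(i - o for i, o in zip(index, offset))
    if any(s < 0 or s >= n for s, n in zip(storage, shape)):
        raise IndexError(f"index {index} is outside the field")
    return storage


shape, offset = (32, 64), (-16, 8)
print(to_storage_index((-16, 8), shape, offset))  # first element on both axes
print(to_storage_index((15, 71), shape, offset))  # last element on both axes
```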

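Interacting with external data

NumPy ndarray

There are two ways to bring a NumPy array into the Taichi scope:

Create a Taichi field f whose shape and dtype match those of the ndarray to import, then copy the values of the ndarray into f with f.from_numpy(arr).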

x = ti.field(int, shape=(3, 3))  # dtype matches the int32 array below
a = np.arange(9).reshape(3, 3).astype(np.int32)
x.from_numpy(a)
print(x)

#[[0 1 2]
# [3 4 5]
# [6 7 8]]

arr = x.to_numpy()
#array([[0, 1, 2],
#       [3, 4, 5],
#       [6, 7, 8]], dtype=int32)

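Pass the ndarray as an argument to a Taichi function or kernel, using ti.types.ndarray() as the type hint. The argument is passed by reference, so this approach is typically used when the ndarray itself needs to be operated on.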

import taichi as ti
import numpy as np
ti.init()

a = np.zeros((5, 5))

@ti.kernel
def test(a: ti.types.ndarray()):
    for i in range(a.shape[0]):  # a parallel for loop
        for j in range(a.shape[1]):
            a[i, j] = i + j

test(a)
print(a)

PyTorch tensor

As with NumPy ndarrays, use x.from_torch() and x.to_torch() to import and export tensors. When calling to_torch(), the device argument must also be given to specify where the tensor is placed.

x = ti.field(float, shape=(3, 3))
t = x.to_torch(device="cuda:0")
print(t.device) # device(type='cuda', index=0)

Shape of external data

When a Taichi field exchanges data with external data, the shapes are matched according to the following rules:

Scalar field

field = ti.field(int, shape=(256, 512))
array = field.to_numpy()

field.shape[0] == array.shape[0]  # 256
field.shape[1] == array.shape[1]  # 512

n-dimensional vector field

field = ti.Vector.field(3, int, shape=(256, 512))
array = field.to_numpy()

field.shape[0] == array.shape[0]  # 256
field.shape[1] == array.shape[1]  # 512
array.shape[2] == 3  # n, the vector dimension

The shape of the external data is (*field_shape, n).

n-by-m (n x m) matrix field

field = ti.Matrix.field(3, 4, ti.i32, shape=(256, 512))
array = field.to_numpy()
array.shape  # (256, 512, 3, 4)

The shape of the external data is (*field_shape, n, m).
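The three rules above share one pattern: the field dimensions come first, followed by the element dimensions. A minimal sketch of that pattern in pure Python follows; `external_shape` is a hypothetical helper, not a Taichi function.

```python
def external_shape(field_shape, element_shape=()):
    """Expected shape of the external array: field dimensions first, then element dimensions."""
    return tuple(field_shape) + tuple(element_shape)


print(external_shape((256, 512)))          # scalar field
print(external_shape((256, 512), (3,)))    # field of 3D vectors
print(external_shape((256, 512), (3, 4)))  # field of 3x4 matrices
```

A helper like this can be used to check an array before calling from_numpy(), for example by comparing its result with array.shape.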

Hint

In addition, external data inside a Taichi kernel is indexed using its physical memory layout. A PyTorch tensor must therefore be made contiguous before it is passed into a Taichi kernel.

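import torch
import taichi as ti
ti.init()

x = ti.field(dtype=int, shape=(3, 3))
y = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = y.T # Transposing the tensor returns a view of the tensor which is not contiguous

@ti.kernel
def copy_scalar(x: ti.template(), y: ti.types.ndarray()):
    for i, j in x:
        y[i, j] = x[i, j]

# copy_scalar(x, y) # error!
# A plain clone() may preserve the transposed strides, so request a contiguous copy explicitly
copy_scalar(x, y.clone(memory_format=torch.contiguous_format)) # correct
copy_scalar(x, y.contiguous()) # correct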
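What "contiguous" means here can be seen from the strides of a tensor. The sketch below models shapes and strides with plain tuples instead of calling PyTorch, so all three functions are hypothetical helpers, and the contiguity check is a simplification that ignores special cases such as axes of size 1.

```python
def row_major_strides(shape):
    """Element strides of a contiguous (row-major) array with the given shape."""
    strides, step = [], 1
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))


def transpose(shape, strides):
    """A transposed view reverses shape and strides but leaves the data where it is."""
    return tuple(reversed(shape)), tuple(reversed(strides))


def is_contiguous(shape, strides):
    """True when the strides equal those of a freshly allocated row-major array."""
    return strides == row_major_strides(shape)


shape = (3, 3)
strides = row_major_strides(shape)               # strides of the original tensor
t_shape, t_strides = transpose(shape, strides)   # strides of the transposed view

print(strides, is_contiguous(shape, strides))
print(t_strides, is_contiguous(t_shape, t_strides))
```

In this model the transposed view keeps the original buffer and only swaps the strides, so indexing it by physical layout would visit the elements in the wrong order. Making a contiguous copy rewrites the buffer so that the strides are row-major again.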
