Studying the Dense layer's call operation: which layer should store the weights, and how should the computation be implemented?

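The idea: store and pass the model's information around as a computation graph.

It may help to think of this as building a lexical analyzer.

Assume a lexical analyzer that recognizes simple arithmetic expressions.

In that picture, the layer type plays the role of the operator and the layer's weights play the role of the variables.

The model information is stored in dictionary form; reading it one entry at a time produces a parse tree, which here is the computation graph.

The information currently stored in the dictionary is the layer type, its weights, and a few other settings.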
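As a rough sketch of the idea above, the dictionary entries can be read in order and chained into graph nodes. The key names (`type`, `weights`) and the `Node` class are assumptions made for illustration, not the framework's actual format; weights are represented only by their shapes here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One graph node: an operator (layer type) plus the variables it owns."""
    op: str
    variables: dict = field(default_factory=dict)
    parent: Optional["Node"] = None  # node whose output feeds this one


def build_graph(layer_configs):
    """Read the layer entries in order and chain them into a linear graph."""
    prev = None
    for cfg in layer_configs:
        prev = Node(op=cfg["type"], variables=cfg.get("weights", {}), parent=prev)
    return prev  # output node; follow .parent to walk back to the input


def ops_in_order(output_node):
    """Collect operator names from the input side to the output side."""
    ops = []
    node = output_node
    while node is not None:
        ops.append(node.op)
        node = node.parent
    return ops[::-1]


# Hypothetical model description (shapes stand in for real weight arrays).
model_info = [
    {"type": "flatten"},
    {"type": "dense", "weights": {"kernel": (784, 128), "bias": (128,)}},
    {"type": "dense", "weights": {"kernel": (128, 10), "bias": (10,)}},
]

output_node = build_graph(model_info)
print(ops_in_order(output_node))
```

A sequential model only needs a single `parent` link per node; a graph with branches would need a list of parents instead.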

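from keras.models import Sequential
from keras.layers import Dense, Flatten

model = Sequential()
model.add(Flatten(input_shape=(784,)))  # flatten the image into a 1D vector
model.add(Dense(128, activation='relu'))  # first Dense layer
model.add(Dense(64, activation='relu'))   # second Dense layer
model.add(Dense(10, activation='softmax'))  # third Dense layer, the output layer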

Given the structure above, where should the weights be stored?

In the layer they belong to.
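To make "each layer stores its own weights" concrete, here is a minimal sketch using the layer sizes from the model above. `DenseWeights` is a hypothetical container, not a Keras class, and the zero initialization is only a placeholder.

```python
import numpy as np


class DenseWeights:
    """Hypothetical container: one Dense layer and the variables it owns."""

    def __init__(self, input_dim, units):
        # kernel maps input_dim features to `units` outputs
        self.kernel = np.zeros((input_dim, units), dtype=np.float32)
        # bias has one entry per output unit
        self.bias = np.zeros((units,), dtype=np.float32)

    def count_params(self):
        return self.kernel.size + self.bias.size


# Same sizes as the model above: 784 -> 128 -> 64 -> 10
sizes = [784, 128, 64, 10]
layers = [DenseWeights(i, o) for i, o in zip(sizes[:-1], sizes[1:])]

params_per_layer = [layer.count_params() for layer in layers]
total_params = sum(params_per_layer)
print(params_per_layer, total_params)
```

Each layer contributes `input_dim * units + units` parameters, so the three Dense layers hold 100480, 8256, and 650 values respectively. The Flatten layer owns no variables.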

Looking at the call operation of the Keras Dense layer helps in deciding how to implement this:

    def call(self, inputs, training=None):
        x = ops.matmul(inputs, self.kernel)
        if self.bias is not None:
            x = ops.add(x, self.bias)
        if self.activation is not None:
            x = self.activation(x)
        return x

The call method of Dense
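The same three steps (matmul, optional bias add, optional activation) can be sketched in plain NumPy. The function names, the random weights, and the two-layer usage below are assumptions for illustration; this is not how Keras dispatches to its backends.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def dense_call(inputs, kernel, bias=None, activation=None):
    """Mirror of the call above: matmul, then bias, then activation."""
    x = np.matmul(inputs, kernel)
    if bias is not None:
        x = x + bias
    if activation is not None:
        x = activation(x)
    return x


rng = np.random.default_rng(0)
batch = rng.normal(size=(2, 784))  # two flattened 28x28 inputs

h = dense_call(batch, rng.normal(scale=0.05, size=(784, 128)), np.zeros(128), relu)
out = dense_call(h, rng.normal(scale=0.05, size=(128, 10)), np.zeros(10), softmax)
print(out.shape)
```

Because the kernel and bias are passed in explicitly, the function makes it clear which variables a layer node has to carry with it in the graph.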

class Matmul(Operation):
    def call(self, x1, x2):
        return backend.numpy.matmul(x1, x2)

    def compute_output_spec(self, x1, x2):
        x1_shape = getattr(x1, "shape", [])
        x2_shape = getattr(x2, "shape", [])
        output_shape = operation_utils.compute_matmul_output_shape(
            x1_shape, x2_shape
        )
        x1_sparse = getattr(x1, "sparse", True)
        x2_sparse = getattr(x2, "sparse", True)
        output_sparse = x1_sparse and x2_sparse
        x1_dtype = backend.standardize_dtype(getattr(x1, "dtype", type(x1)))
        x2_dtype = backend.standardize_dtype(getattr(x2, "dtype", type(x2)))
        if x1_dtype == "int8" and x2_dtype == "int8":
            dtype = "int32"
        else:
            dtype = dtypes.result_type(x1_dtype, x2_dtype)
        return KerasTensor(output_shape, dtype=dtype, sparse=output_sparse)

ops/numpy.py (the Matmul operation)

def matmul(x1, x2):
    x1 = convert_to_tensor(x1)
    x2 = convert_to_tensor(x2)
    # When both x1 and x2 are of int8, we cast the outputs to int32 to align
    # with jax
    x1_dtype = standardize_dtype(x1.dtype)
    x2_dtype = standardize_dtype(x2.dtype)
    if x1_dtype == "int8" and x2_dtype == "int8":
        dtype = "int32"
    else:
        dtype = dtypes.result_type(x1.dtype, x2.dtype)
    x1 = x1.astype(dtype)
    x2 = x2.astype(dtype)
    return np.matmul(x1, x2).astype(dtype)

backend/numpy/numpy.py
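The dtype handling in the backend function above can be tried out in isolation. The helper below is a hypothetical stand-in: it uses `np.result_type` where Keras uses its own `dtypes.result_type`, so the promotion rules may differ for some dtype pairs.

```python
import numpy as np


def matmul_promoted(x1, x2):
    """Hypothetical helper: int8 x int8 is computed in int32, as in the snippet above."""
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    if x1.dtype == np.int8 and x2.dtype == np.int8:
        dtype = np.dtype("int32")
    else:
        # Assumption: NumPy's promotion stands in for Keras's dtypes.result_type
        dtype = np.result_type(x1.dtype, x2.dtype)
    return np.matmul(x1.astype(dtype), x2.astype(dtype))


a = np.array([[100, 100]], dtype=np.int8)
b = np.array([[1], [1]], dtype=np.int8)

raw = np.matmul(a, b)         # result stays int8, so the sum cannot be represented
safe = matmul_promoted(a, b)  # computed in int32

print(raw.dtype, safe.dtype, safe[0, 0])
```

The source comment gives alignment with JAX as the reason for the int32 cast. A practical side effect in plain NumPy is that int8 products no longer overflow the 8-bit range.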
