
dev_AI_framework

Identifying What Else Is Needed by Training a Simple Model

=== [TEST] Sequential model training + evaluation (including metrics) ===
INFO:dev.models.sequential:✅ Layer added: Flatten (input_shape=(1, 2, 2), output_shape=(1, 4))
INFO:dev.models.sequential:✅ Layer added: Dense (input_shape=(1, 4), output_shape=(1, 2))
INFO:dev.models.sequential:✅ Layer added: Activation (input_shape=(1, 2), output_shape=(1, 2))
[INFO] Dense initialization complete: weights=0.5, bias=0.1
INFO:dev.models.sequential:
=== [Epoch 1] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.001266, grad_in=0.001265, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999071, grad_in=0.000928, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.648279
INFO:dev.models.sequential:
=== [Epoch 2] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999071, grad_in=0.000928, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424165
INFO:dev.models.sequential:
=== [Epoch 3] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.524972, grad_in=0.249376, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.110301
INFO:dev.models.sequential:
=== [Epoch 4] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999067, grad_in=0.000932, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424160
INFO:dev.models.sequential:
=== [Epoch 5] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999954, grad_in=0.000046, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424959
INFO:dev.models.sequential:
=== [Epoch 6] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999067, grad_in=0.000932, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424160
INFO:dev.models.sequential:
=== [Epoch 7] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999954, grad_in=0.000046, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424959
INFO:dev.models.sequential:
=== [Epoch 8] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999067, grad_in=0.000932, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424160
INFO:dev.models.sequential:
=== [Epoch 9] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999954, grad_in=0.000046, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424959
INFO:dev.models.sequential:
=== [Epoch 10] started ===
=== [DEBUG] All backward ops ===
op_type=5 input=input param= output=flatten_2443705994368_out
op_type=0 input=flatten_2443705994368_out param=dense_2442952969200_W output=dense_2442952969200_linear
op_type=1 input=dense_2442952969200_linear param=dense_2442952969200_b output=dense_2442952969200_out
op_type=3 input=dense_2442952969200_out param= output=activation_2442954032624_out
[activation_backward] grad_out[0]=1.000000, out=0.500000, grad_in=0.250000, act_type=3
[activation_backward] grad_out[1]=1.000000, out=0.999067, grad_in=0.000932, act_type=3
??run_graph_backward finished.
INFO:dev.models.sequential:[Batch complete] loss: 0.424160
INFO:dev.models.sequential:📊 Evaluation loss: 0.424959, metric (mse): 0.424959

📊 Final evaluation metric (MSE): 0.424959
🔍 Prediction output:
 [[0.5        0.99995434]]

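The logged loss values decrease through epoch 3 (0.648 → 0.424 → 0.110), but from epoch 4 on the loss returns to about 0.42 and stays there, alternating between 0.424160 and 0.424959, as if trapped in a local minimum. In those epochs the second output is saturated near 1 (out ≈ 0.999), so its activation gradient is below 0.001.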
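To make the saturation in the log easier to see, here is a small check of the `[activation_backward]` lines. It assumes that `act_type=3` denotes a sigmoid, which the post does not state. This is an inference from the numbers, since each logged `grad_in` matches `grad_out * out * (1 - out)`. The helper name `sigmoid_grad_from_output` is hypothetical and not part of the framework.

```python
# (out, grad_in) pairs copied from the [activation_backward] lines in the log.
logged = [
    (0.001266, 0.001265),
    (0.999071, 0.000928),
    (0.500000, 0.250000),
    (0.524972, 0.249376),
    (0.999067, 0.000932),
    (0.999954, 0.000046),
]


def sigmoid_grad_from_output(out: float, grad_out: float = 1.0) -> float:
    """Backward pass of a sigmoid expressed through its own output.

    Assumption: act_type=3 is a sigmoid; the log always shows grad_out=1.0.
    """
    return grad_out * out * (1.0 - out)


for out, grad_in in logged:
    expected = sigmoid_grad_from_output(out)
    # The logged values are rounded to 6 decimals, so compare with a tolerance.
    status = "match" if abs(expected - grad_in) < 2e-6 else "MISMATCH"
    print(f"out={out:.6f} logged={grad_in:.6f} recomputed={expected:.6f} {status}")
```

Under that assumption, an output near 0.999 passes back less than a thousandth of the incoming gradient, which would explain why the second output barely moves once it saturates.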

Which additional features are needed to resolve this kind of problem during training?

 

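  • Changing the optimizer - already implemented
  • Reducing the learning rate - already implemented
  • Changing the loss function - already implemented
  • Other changes, such as the model structure and the weight initialization method - already implemented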
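As an illustration of why the loss-function item in the list above can matter for a saturated output, the sketch below compares the gradient with respect to the pre-activation value for MSE and for binary cross-entropy. It assumes a sigmoid output layer. The target value `0.1` is a hypothetical choice for illustration, because the post does not list the targets, and the function names are not part of the framework.

```python
def mse_grad_wrt_logit(out: float, target: float) -> float:
    """d/dz of (out - target)**2 when out = sigmoid(z), for one element.

    The chain rule multiplies by out * (1 - out), which vanishes as out -> 0 or 1.
    """
    return 2.0 * (out - target) * out * (1.0 - out)


def bce_grad_wrt_logit(out: float, target: float) -> float:
    """d/dz of binary cross-entropy when out = sigmoid(z), for one element.

    The out * (1 - out) factor cancels, leaving the plain error term.
    """
    return out - target


out = 0.999954  # saturated output copied from the log
target = 0.1    # hypothetical target, not given in the post

mse_g = mse_grad_wrt_logit(out, target)
bce_g = bce_grad_wrt_logit(out, target)
print(f"MSE gradient w.r.t. logit: {mse_g:.3e}")
print(f"BCE gradient w.r.t. logit: {bce_g:.3e}")
print(f"ratio BCE / MSE: {bce_g / mse_g:.0f}")
```

Under these assumptions the cross-entropy gradient is several orders of magnitude larger than the MSE gradient at the same saturated output. This is one reason a loss change may help here. Smaller initial weights or a lower learning rate would instead aim at avoiding the saturation in the first place.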