AICF Framework - Execution & Synchronization Model

1. High-level Execution Flow

python_framework_test.py
 └─ model.compile(...)
     ├─ trace_ir()           # generate the IR
     ├─ lower_to_backend()   # primitive ops + kernel_id
     ├─ build_binding_plan() # runtime tensor binding
     └─ create CompiledTrainStep

 └─ model.train_step(...)    # eager execution (safe)
 └─ model.capture(...)       # CUDA Graph capture
 └─ model.replay(n, sync=?)  # CUDA Graph replay
 └─ model.reset()            # graph reset

2. model.compile(...)

model.compile(
    optimizer=opt,
    warmup_inputs={"x": x, "t": t},
    warmup_runs=2,
)

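Role

Compiles the model and optimizer into a fixed execution graph.
From warmup_inputs, it automatically infers
- batch_size (B)
- feature dim (D)
- device / dtype
Internally, it performs
- IR trace
- lowering
- binding plan
- executor preparation
warmup_inputs is a required argument: it is what guarantees graph stability.
The warmup runs themselves use eager execution.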
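
The inference step described above can be pictured with a small sketch. This is not AICF's actual code: FakeTensor, StaticSignature, and infer_signature are hypothetical names made up for illustration, and the sketch assumes every input is 2-D with the batch dimension first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a framework tensor (not an AICF type)."""
    shape: tuple
    dtype: str
    device: str


@dataclass(frozen=True)
class StaticSignature:
    """Static facts a compiled step could be specialized on (assumed layout)."""
    batch_size: int
    feature_dim: int
    dtype: str
    device: str


def infer_signature(warmup_inputs, key="x"):
    """Derive B, D, dtype, and device from the warmup inputs."""
    tensors = list(warmup_inputs.values())

    # All inputs must live on one device and share one batch size,
    # otherwise a single fixed graph cannot describe them.
    devices = {t.device for t in tensors}
    if len(devices) != 1:
        raise ValueError(f"inputs span multiple devices: {sorted(devices)}")
    batch_sizes = {t.shape[0] for t in tensors}
    if len(batch_sizes) != 1:
        raise ValueError(f"inconsistent batch sizes: {sorted(batch_sizes)}")

    x = warmup_inputs[key]
    return StaticSignature(
        batch_size=x.shape[0],
        feature_dim=x.shape[1],
        dtype=x.dtype,
        device=x.device,
    )


sig = infer_signature({
    "x": FakeTensor((64, 8), "float32", "cuda:0"),
    "t": FakeTensor((64, 8), "float32", "cuda:0"),
})
print(sig)
```

Rejecting mismatched inputs at this point is one plausible reason the warmup inputs are mandatory: once the shapes are baked into the graph, a later mismatch cannot be handled gracefully.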

3. model.train_step(...)

model.train_step({"x": x, "t": t})

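Execution characteristics

Eager execution
Includes the optimizer meta update
Synchronizes internally
- Once the call returns, parameter values can be read safely right away

Purpose

Correctness verification
The actual training step after warmup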
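
Because the eager path is safe to read from immediately, it is a natural place for a correctness check. The sketch below shows the pattern only: ToyModel and reference_step are hypothetical, a one-parameter model with plain SGD stands in for the real model and optimizer, and no GPU is involved.

```python
class ToyModel:
    """Hypothetical one-parameter model: y = w * x, loss = (y - t) ** 2."""

    def __init__(self, w=0.0, lr=0.1):
        self.w = w
        self.lr = lr
        self.step_count = 0  # stands in for optimizer meta state

    def train_step(self, batch):
        x, t = batch["x"], batch["t"]
        grad = 2.0 * (self.w * x - t) * x  # d(loss)/dw
        self.w -= self.lr * grad           # SGD update
        self.step_count += 1               # optimizer meta update
        # A GPU implementation would synchronize here before returning,
        # which is what makes the read below safe.


def reference_step(w, x, t, lr):
    """Hand-written reference for one SGD step on the same loss."""
    return w - lr * 2.0 * (w * x - t) * x


model = ToyModel(w=0.5, lr=0.1)
expected = reference_step(0.5, x=2.0, t=3.0, lr=0.1)

model.train_step({"x": 2.0, "t": 3.0})

# Read the parameter right after the call and compare with the reference.
assert abs(model.w - expected) < 1e-12
print(model.w, model.step_count)
```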

4. model.capture(...)

model.capture({"x": x, "t": t})

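Execution characteristics

Performs CUDA Graph capture
Internally
- runs on a dedicated CUDA stream
- fixes the graph structure and the kernel launch pattern

Contract

After capture
- the operation order and the memory bindings are fixed
- the values are not fixed: each replay reads whatever the bound buffers hold at that moment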
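
The contract can be illustrated without a GPU. The following is a toy model of the idea rather than AICF or CUDA code: RecordedGraph is a hypothetical class, and single-element Python lists stand in for device buffers so that "same buffer, new value" can be told apart from "different buffer".

```python
class RecordedGraph:
    """Toy stand-in for a captured graph: fixed op order, fixed buffers."""

    def __init__(self):
        self.ops = []

    def record(self, fn, out, *ins):
        # Store references to the buffers themselves, not their values.
        self.ops.append((fn, out, ins))

    def replay(self):
        # Re-run the recorded ops in order against the bound buffers.
        for fn, out, ins in self.ops:
            out[0] = fn(*(buf[0] for buf in ins))


# One-element lists play the role of preallocated device buffers.
x, w, y, z = [0.0], [2.0], [0.0], [0.0]

graph = RecordedGraph()
graph.record(lambda a, b: a * b, y, x, w)  # y = x * w
graph.record(lambda a: a + 1.0, z, y)      # z = y + 1

x[0] = 3.0          # write a new value into the bound buffer
graph.replay()
first = z[0]

x[0] = 5.0          # same buffer, different value
graph.replay()
second = z[0]

x = [9.0]           # rebinding the name creates a NEW buffer
graph.replay()      # the graph still reads the old one (value 5.0)
third = z[0]

print(first, second, third)
```

The last replay shows why inputs have to be copied into the captured buffers in place: a freshly allocated tensor is invisible to the graph.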

5. model.replay(n, sync=...)

model.replay(n=3, sync=False)  # default

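Default behavior

Launches the CUDA Graph asynchronously
Very low CPU overhead
This is the performance path

Synchronized replay

model.replay(n=1, sync=True)

Calls synchronize() after the CUDA Graph launch
Replay completion is guaranteed on the host
Intended for testing, debugging, and result verification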
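
The difference between the two modes can be sketched with a worker thread standing in for the device stream. ToyStream and ToyGraphRunner are hypothetical names; only the replay(n, sync) signature mirrors the article, and the 10 ms sleep is an arbitrary stand-in for kernel time.

```python
import queue
import threading
import time


class ToyStream:
    """Toy stand-in for a device stream: a worker thread draining a queue."""

    def __init__(self):
        self._q = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            work = self._q.get()
            work()
            self._q.task_done()

    def launch(self, work):
        self._q.put(work)  # returns immediately, like an async launch

    def synchronize(self):
        self._q.join()     # block the host until all queued work is done


class ToyGraphRunner:
    def __init__(self):
        self.stream = ToyStream()
        self.completed = 0

    def _graph_body(self):
        time.sleep(0.01)   # pretend the captured kernels take some time
        self.completed += 1

    def replay(self, n=1, sync=False):
        for _ in range(n):
            self.stream.launch(self._graph_body)
        if sync:
            self.stream.synchronize()


runner = ToyGraphRunner()

runner.replay(n=3, sync=False)
seen_without_sync = runner.completed  # may be anything from 0 to 3

runner.stream.synchronize()
seen_after_sync = runner.completed    # all three replays have finished

runner.replay(n=1, sync=True)
seen_after_sync_replay = runner.completed

print(seen_without_sync, seen_after_sync, seen_after_sync_replay)
```

With sync=False the host may observe a partially updated state, so any read of parameters or outputs after an asynchronous replay should be preceded by a synchronization point.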
