sklearn : _check_sample_weight (sample weight validation)

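import numbers

import numpy as np
from sklearn.utils.validation import _num_samples, check_array, check_non_negative


def _check_sample_weight(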
    sample_weight, X, dtype=None, copy=False, only_non_negative=False
):
    """Validate sample weights.

    Note that passing sample_weight=None will output an array of ones.
    Therefore, in some cases, you may want to protect the call with:
    if sample_weight is not None:
        sample_weight = _check_sample_weight(...)

    Parameters
    ----------
    sample_weight : {ndarray, Number or None}, shape (n_samples,)
        Input sample weights.

    X : {ndarray, list, sparse matrix}
        Input data.

    only_non_negative : bool, default=False
        Whether or not the weights are expected to be non-negative.

        .. versionadded:: 1.0

    dtype : dtype, default=None
        dtype of the validated `sample_weight`.
        If None, and the input `sample_weight` is an array, the dtype of the
        input is preserved; otherwise an array with the default numpy dtype
        is allocated.  If `dtype` is not one of `float32`, `float64`,
        `None`, the output will be of dtype `float64`.

    copy : bool, default=False
        If True, a copy of sample_weight will be created.

    Returns
    -------
    sample_weight : ndarray of shape (n_samples,)
        Validated sample weight. It is guaranteed to be "C" contiguous.
    """
    n_samples = _num_samples(X)

    if dtype is not None and dtype not in [np.float32, np.float64]:
        dtype = np.float64

    if sample_weight is None:
        sample_weight = np.ones(n_samples, dtype=dtype)
    elif isinstance(sample_weight, numbers.Number):
        sample_weight = np.full(n_samples, sample_weight, dtype=dtype)
    else:
        if dtype is None:
            dtype = [np.float64, np.float32]
        sample_weight = check_array(
            sample_weight,
            accept_sparse=False,
            ensure_2d=False,
            dtype=dtype,
            order="C",
            copy=copy,
            input_name="sample_weight",
        )
        if sample_weight.ndim != 1:
            raise ValueError("Sample weights must be 1D array or scalar")

        if sample_weight.shape != (n_samples,):
            raise ValueError(
                "sample_weight.shape == {}, expected {}!".format(
                    sample_weight.shape, (n_samples,)
                )
            )

    if only_non_negative:
        check_non_negative(sample_weight, "`sample_weight`")

    return sample_weight

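Validates and normalizes the sample-weight input, converting it into a form that can be used during model training.

It checks that sample_weight is valid and, where needed, fills in a default value or converts the data.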
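
The docstring's note about protecting the call with `if sample_weight is not None:` can be sketched as below. This is only an illustration: `fit_mean` is a hypothetical helper, not part of sklearn, and `np.asarray` stands in for the validation call so the snippet runs with NumPy alone. The assumption is that the caller has a cheaper unweighted path it prefers when no weights are given.

```python
import numpy as np


def fit_mean(X, sample_weight=None):
    """Hypothetical estimator step: column means, optionally weighted."""
    X = np.asarray(X, dtype=float)
    if sample_weight is None:
        # No weights given: take the unweighted path and skip
        # allocating an array of ones.
        return X.mean(axis=0)
    # Weights given: real code would call _check_sample_weight here.
    w = np.asarray(sample_weight, dtype=float)
    return np.average(X, axis=0, weights=w)


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
unweighted = fit_mean(X)
unit_weighted = fit_mean(X, sample_weight=np.ones(3))
```

Both calls give the same column means, which is why an array of ones is a safe default whenever the guard is not used.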

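Parameters

sample_weight ({ndarray, Number, or None}), shape (n_samples,)
- The input to use as sample weights: None, a number, or a 1-D array
- If None, every sample gets the same weight, 1
- If a number, every sample gets that value as its weight
- If an array, each sample gets its own weight
X ({ndarray, list, sparse matrix})
- The input data; used only to get the number of samples, which the length of the weight array must match
dtype (dtype, default=None)
- The data type of the validated sample_weight
- If None and sample_weight is an array, a float32 or float64 input keeps its dtype and any other dtype is converted to float64; if sample_weight is None or a number, an array with the default NumPy dtype is allocated
- If dtype is anything other than float32, float64, or None, the output is float64
copy (bool, default=False)
- If True, a copy of sample_weight is created
only_non_negative (bool, default=False)
- If True, the weights are checked and a ValueError is raised if any of them is negative
- If False, negative weights are allowed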
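
As a rough illustration of how the three kinds of sample_weight input turn into a per-sample array, here is a simplified NumPy-only sketch. `expand_weights` is a hypothetical stand-in that mirrors only the None / number / array branches of the source above; it skips the check_array validation that the real function performs.

```python
import numbers

import numpy as np


def expand_weights(sample_weight, n_samples, dtype=None):
    """Hypothetical stand-in for the None / number / array branches."""
    if sample_weight is None:
        # No weights: every sample gets weight 1.
        return np.ones(n_samples, dtype=dtype)
    if isinstance(sample_weight, numbers.Number):
        # One number: the same weight is repeated for every sample.
        return np.full(n_samples, sample_weight, dtype=dtype)
    # Array-like: one weight per sample (no validation in this sketch).
    return np.asarray(sample_weight, dtype=dtype)


w_none = expand_weights(None, 3)
w_scalar = expand_weights(2.5, 3)
w_array = expand_weights([0.5, 1.0, 2.0], 3, dtype=np.float32)
```

In all three cases the result has shape (n_samples,), so the code that consumes the weights does not need to know which form the caller passed.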

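Check the number of samples
- n_samples: the number of samples in X, obtained with _num_samples(X)
Set the data type
- If dtype is not None, float32, or float64, it is replaced with float64
When sample_weight is None
- An array of ones of length n_samples is created
When sample_weight is a number
- An array of length n_samples filled with that value is created
When sample_weight is an array
- check_array converts it to a C-contiguous array; it must be 1-D with shape (n_samples,), otherwise a ValueError is raised
Check for negative weights
- If only_non_negative is True, check_non_negative is called on the result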
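
The dtype rule and the array checks can be tried out with the NumPy-only sketch below. `resolve_dtype` and `validate_array` are hypothetical helpers that imitate the corresponding lines of the source above; the real function delegates to check_array and check_non_negative, and the negative-value message here is only an approximation of the library's wording.

```python
import numpy as np


def resolve_dtype(dtype):
    """Anything other than None, float32 or float64 becomes float64."""
    if dtype is not None and dtype not in [np.float32, np.float64]:
        return np.float64
    return dtype


def validate_array(sample_weight, n_samples, only_non_negative=False):
    """Simplified imitation of the array branch plus the sign check."""
    w = np.asarray(sample_weight, dtype=np.float64, order="C")
    if w.ndim != 1:
        raise ValueError("Sample weights must be 1D array or scalar")
    if w.shape != (n_samples,):
        raise ValueError(
            f"sample_weight.shape == {w.shape}, expected {(n_samples,)}!"
        )
    if only_non_negative and (w < 0).any():
        # Approximate wording; the library raises from check_non_negative.
        raise ValueError("Negative values in data passed to `sample_weight`")
    return w


ok = validate_array([1, 2, 3], n_samples=3)

# Collect the error raised by each kind of invalid input.
errors = []
bad_inputs = [
    ([[1.0, 2.0], [3.0, 4.0]], {}),              # 2-D instead of 1-D
    ([1.0, 2.0], {}),                            # wrong length
    ([1.0, -1.0, 2.0], {"only_non_negative": True}),  # negative weight
]
for weights, kwargs in bad_inputs:
    try:
        validate_array(weights, n_samples=3, **kwargs)
    except ValueError as exc:
        errors.append(str(exc))
```

Each of the three invalid inputs ends in a ValueError, while the integer list is accepted and converted to float64.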
