Preprocessing Tips

KyungOok,Sung
Jun 28, 2021

Preprocessing tips

np.where(X_train[['petal_length']] >= 2, 1, 0)  # 1 where petal_length >= 2, else 0; double brackets give an (n, 1) array

[String to numeric conversion]
test['Sex'] = test['Sex'].map({'male':0, 'female':1})
test['Embarked'] = test['Embarked'].map({'S':0, 'C':1, 'Q':2})
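
As a minimal sketch of the two conversions above, the snippet below applies np.where and Series.map to a small made-up DataFrame. The data and the column names petal_flag and Sex_num are hypothetical and do not come from the original dataset. It also shows that map leaves NaN for any value missing from the dictionary, which is worth checking before training.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame({
    "petal_length": [1.4, 4.7, 2.0],
    "Sex": ["male", "female", "unknown"],
})

# Threshold binarization: 1 where petal_length >= 2, else 0
toy["petal_flag"] = np.where(toy["petal_length"] >= 2, 1, 0)

# Label mapping: values absent from the dict ("unknown") become NaN
toy["Sex_num"] = toy["Sex"].map({"male": 0, "female": 1})

# Count the values that the mapping did not cover
n_unmapped = int(toy["Sex_num"].isna().sum())
print(toy)
print("unmapped:", n_unmapped)
```

If n_unmapped is greater than zero, the dictionary is probably missing a category or the column contains missing values.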

[One-Hot Encoding]
test = pd.get_dummies(test, columns=['Embarked'], prefix='Embarked')
print(test.columns)
[out]
Index(['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked_0', 'Embarked_1', 'Embarked_2'], dtype='object')
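
The names Embarked_0, Embarked_1 and Embarked_2 in the output above come from the earlier numeric mapping of Embarked. As a rough sketch on made-up rows (the toy DataFrame is an assumption, not the original data), the same call applied to the raw string values names the new columns after the strings instead:

```python
import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame({"PassengerId": [1, 2, 3], "Embarked": ["S", "C", "Q"]})

# One indicator column per distinct value; the source column is dropped
encoded = pd.get_dummies(toy, columns=["Embarked"], prefix="Embarked")
print(list(encoded.columns))
```

This suggests the numeric mapping step may be unnecessary for a column that is going to be one-hot encoded anyway.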

[Set operations]

df = pd.concat([df, df_null], axis=0)  # UNION ALL: append rows, duplicates kept
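
The line above is the pandas counterpart of SQL UNION ALL. As a hedged sketch, the other common set operations could be written as follows. The frames a and b and the id column are hypothetical, and these are common pandas idioms rather than anything the original post specifies.

```python
import pandas as pd

# Hypothetical toy frames, for illustration only
a = pd.DataFrame({"id": [1, 2, 3]})
b = pd.DataFrame({"id": [3, 4]})

# UNION ALL: stack rows, keep duplicates
union_all = pd.concat([a, b], axis=0, ignore_index=True)

# UNION: same, then drop duplicate rows
union = union_all.drop_duplicates().reset_index(drop=True)

# INTERSECT: rows present in both frames
intersect = a.merge(b, on="id", how="inner")

# EXCEPT (a minus b): rows of a with no match in b
marked = a.merge(b, on="id", how="left", indicator=True)
a_minus_b = marked.loc[marked["_merge"] == "left_only", ["id"]]
```

Passing ignore_index=True avoids repeated index labels after stacking, which the original one-liner would otherwise carry over from both frames.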
