e-commerce (8)

E-Commerce Part Ⅷ: Applying Natural Language Processing (NLP)
Natural Language Processing (NLP)
ⅰ. Importing Modules & Data Skimming
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import nltk
In [2]:
df = pd.read_csv("Data/yelp.csv", index_col=0)
df.head()
Out[2]:
         review_id               user_id                 business_id  stars  date  text  useful  funny  cool
2967245  aMleVK0lQcOSNCs56_gSbg  miHaLnLanDKfZqZHet0uWw  Xp_cWXY5rxDLkX-wqUg-i..

E-Commerce Part Ⅶ: Applying Time Series Analysis
Time Series Data Analysis
ⅰ. Importing Modules & Data Skimming
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
pd.options.display.max_columns = 30
df = pd.read_excel("Data/Superstore.xls", index_col=0)
df.head()
Out[2]:
Order ID  Order Date  Ship Date  Ship Mode  Customer ID  Customer Name  Segment  Country  City  State  Postal Code  Region  Produ..

E-Commerce Part Ⅵ: Applying K Means Clustering
K Means Clustering
ⅰ. Importing Modules & Understanding the Data
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
df = pd.read_csv("Data/Mall_Customers.csv", index_col=0)
df.head()
Out[2]:
CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
1           Male    19   15                  39
2           Male    21   15                  81
3           Female  20   16                  6
4           Female  23   16                  77
5           Female  31   17                  40
In [3]:
df.info()
..

E-Commerce Part Ⅴ: Applying the Random Forest Model
Random Forest
ⅰ. Modules & Checking Data Characteristics
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
member = pd.read_csv("Data/member.csv")
trans = pd.read_csv("Data/transaction.csv")
member.head()
Out[2]:
   id      recency  zip_code   is_referral  channel  conversion
0  906145  10       Surburban  0            Phone    0
1  184478  6        Rural      1            Web      0
2  394235  7        Surburban  1            Web      0
3  130152  9 ..

E-Commerce Part Ⅳ: Applying the Decision Tree Model
Decision Tree
ⅰ. Importing Modules & Checking Data Characteristics
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
df = pd.read_csv("Data/galaxy.csv")
df.info()
RangeIndex: 1485 entries, 0 to 1484
Data columns (total 9 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   BuyItNow    1485 non-null   int64
 1   startprice  1485 non-null   float64
 2   carri..

E-Commerce Part Ⅲ: Applying the KNN Model
K Nearest Neighbour
ⅰ. Importing Modules & Checking the Data
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
df = pd.read_csv("Data/churn.csv")
df.info()
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   customerID  7043 non-null   object
 1   gender      7043 non-null   object
 2   Seni..

E-Commerce Part Ⅱ: Applying Logistic Regression
Logistic Regression
ⅰ. Importing Modules & Checking Data Characteristics
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
df = pd.read_csv("Data/advertising.csv")
df.info()
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Daily Time Spent on Site  1000 non-null   float64
 1   ..

E-Commerce Part Ⅰ: Applying Linear Regression
Linear Regression
ⅰ. Importing Modules
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
ⅱ. Checking Data Characteristics
Check the data for missing values and outliers, and identify the variables to use in the analysis.
In [2]:
df = pd.read_csv("Data/ecommerce.csv")
df.info()
RangeIndex: 500 entries, 0 to 499
Data columns (total 8 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  ..
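
The missing-value and outlier check that Part Ⅰ describes could look roughly like the sketch below. It is not taken from the post. The excerpt does not show the columns of `Data/ecommerce.csv`, so the DataFrame here is a hypothetical stand-in with invented column names and values. The 1.5 × IQR rule is one common outlier heuristic, assumed here for illustration, and `iqr_outliers` is a helper defined only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the real columns of Data/ecommerce.csv
# are not visible in the excerpt, so these names and values are invented.
df = pd.DataFrame({
    "session_minutes": [31.2, 33.5, 32.8, np.nan, 34.1, 95.0, 32.0, 33.0],
    "yearly_spend": [480.0, 510.5, 495.2, 502.3, np.nan, 499.9, 505.1, 490.7],
})

# Count missing values in each column.
missing_counts = df.isna().sum()


def iqr_outliers(series: pd.Series) -> pd.Series:
    """Return a boolean mask that is True where a value falls outside
    [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]. NaN values compare as False."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)


# Count flagged values in each column (assumed heuristic, see lead-in).
outlier_counts = df.apply(iqr_outliers).sum()

print(missing_counts)
print(outlier_counts)
```

On real data, the same two calls would run on the DataFrame returned by `pd.read_csv`, restricted to its numeric columns, for example with `df.select_dtypes("number")`.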