mtcars {datasets} | R Documentation |
Motor Trend Car Road Tests
Description
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
Usage
mtcars
Format
A data frame with 32 observations on 11 (numeric) variables.
[, 1] | mpg | Miles/(US) gallon |
[, 2] | cyl | Number of cylinders |
[, 3] | disp | Displacement (cu.in.) |
[, 4] | hp | Gross horsepower |
[, 5] | drat | Rear axle ratio |
[, 6] | wt | Weight (1000 lbs) |
[, 7] | qsec | 1/4 mile time |
[, 8] | vs | Engine (0 = V-shaped, 1 = straight) |
[, 9] | am | Transmission (0 = automatic, 1 = manual) |
[,10] | gear | Number of forward gears |
[,11] | carb | Number of carburetors |
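The 0/1 codes for vs and am in the table above are easy to misread in plots and summaries, so it can help to map them to labels first. The following is a minimal sketch under stated assumptions: the four-row frame `toy` and its row names are hypothetical stand-ins rather than rows from mtcars, and the label dictionaries simply restate the codes given in the Format table.

```python
import pandas as pd

# Hypothetical toy rows (NOT real mtcars data); only the coding scheme comes from the table.
toy = pd.DataFrame(
    {"vs": [0, 1, 1, 0], "am": [1, 1, 0, 0]},
    index=["car_a", "car_b", "car_c", "car_d"],
)

# Codes as documented: vs (0 = V-shaped, 1 = straight), am (0 = automatic, 1 = manual)
engine_labels = {0: "V-shaped", 1: "straight"}
transmission_labels = {0: "automatic", 1: "manual"}

# Add readable label columns next to the numeric codes
decoded = toy.assign(
    engine=toy["vs"].map(engine_labels),
    transmission=toy["am"].map(transmission_labels),
)

# Cross-tabulate engine shape against transmission type
counts = pd.crosstab(decoded["engine"], decoded["transmission"])
print(decoded)
print(counts)
```

The same two `map` calls should work unchanged on a full mtcars data frame, provided its vs and am columns follow the coding above.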
Note
Henderson and Velleman (1981) comment in a footnote to Table 1: ‘Hocking [original transcriber]'s noncrucial coding of the Mazda's rotary engine as a straight six-cylinder engine and the Porsche's flat engine as a V engine, as well as the inclusion of the diesel Mercedes 240D, have been retained to enable direct comparisons to be made with previous analyses.’
Source
Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411.
# load base libraries and define user-defined functions
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from numpy.polynomial.polynomial import polyfit
import matplotlib.style as style
from IPython.display import Image
import warnings
warnings.filterwarnings('ignore')
# define universally useful UDFs
# define random jitter
def rjitt(arr):
    # noise standard deviation is 1% of the data range
    stdev = .01*(max(arr)-min(arr))
    return arr + np.random.randn(len(arr)) * stdev
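Because rjitt draws fresh noise on every call, the same plot looks slightly different each time it is drawn. A seeded variant is sketched below as one possible way to make the jitter reproducible. The name `rjitt_seeded`, its `scale` and `seed` parameters, and the sample values in `weights` are assumptions introduced for illustration; they are not part of the original script.

```python
import numpy as np

def rjitt_seeded(arr, scale=0.01, seed=None):
    """Add Gaussian noise whose standard deviation is scale * (max - min) of arr."""
    values = np.asarray(arr, dtype=float)
    rng = np.random.default_rng(seed)          # local generator, global state untouched
    stdev = scale * (values.max() - values.min())
    return values + rng.standard_normal(values.shape) * stdev

# Hypothetical sample values, used only to exercise the function
weights = np.array([2.1, 2.8, 3.3, 3.9, 4.6])

jittered = rjitt_seeded(weights, seed=0)
same_again = rjitt_seeded(weights, seed=0)     # same seed, same noise

print(jittered)
print(np.array_equal(jittered, same_again))
```

A constant input has zero range, so the noise scale is zero and the values come back unchanged; the same holds for the original rjitt.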
# load mtcars.csv from a mirrored blog posting
mtcars = pd.read_csv('https://t1.daumcdn.net/cfile/blog/99F8633E5E8ECB130D?download', index_col=0)
mtcars.head()
# create a sample scatterplot
from sklearn.metrics import r2_score
nobs = len(mtcars)
varx = mtcars.wt
vary = mtcars.hp
colors1 = ['red' if x==1 else 'blue' for x in mtcars.am]
size1 = mtcars.mpg / mtcars.mpg.max() *70
plt.figure(figsize = (8,4), dpi=120)
plt.scatter(rjitt(varx), rjitt(vary),
alpha=0.3, color=colors1, s=size1)
plt.xlabel('wt')
plt.ylabel('hp')
plt.suptitle('mpg vs. wt, hp and am', size=12)
plt.title('- size: mpg ; red: am')
plt.plot(np.unique(varx), np.poly1d(np.polyfit(varx, vary, 1))(np.unique(varx)),
color='grey', linewidth=2, linestyle='--')
plt.plot(np.unique(varx), np.poly1d(np.polyfit(varx, vary, 3))(np.unique(varx)),
color='silver', linewidth=6, alpha=0.4, )
rsqrd = r2_score(vary, np.poly1d(np.polyfit(varx, vary, 1))(varx))
rsqrdtxt = '* r sqrd = ' + str(round(rsqrd,3))
crrval = np.corrcoef(varx,vary)[0,1]
crrtxt = '** correlation = ' + str(round(crrval,3))
plt.text(varx.max()*0.7, vary.max()*0.3, rsqrdtxt)
plt.text(varx.max()*0.7, vary.max()*0.25, crrtxt)
# label 10 distinct cars, sampled once (the original redrew the sample on every iteration)
labeled = np.random.choice(mtcars.index, 10, replace=False)
for x in labeled:
    plt.text(varx[x]-0.2, vary[x]+4, x, size=7)
plt.show()
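The plot annotates both an r squared value (from the degree-1 fit) and a correlation coefficient. For a straight-line least-squares fit with an intercept, the first is the square of the second, so the two annotations carry the same information. The sketch below checks this numerically on synthetic data; the arrays `x` and `y` are generated for illustration and are not the mtcars columns, and r squared is computed directly from sums of squares instead of with sklearn's r2_score.

```python
import numpy as np

# Synthetic data for illustration only (NOT mtcars values)
rng = np.random.default_rng(42)
x = rng.uniform(1.5, 5.5, size=32)
y = 50 + 40 * x + rng.normal(0, 30, size=32)

# Degree-1 least-squares fit; np.polyfit returns the highest power first
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept

# Coefficient of determination: 1 - SS_res / SS_tot
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Pearson correlation between x and y
corr = np.corrcoef(x, y)[0, 1]

print(round(r_squared, 6), round(corr ** 2, 6))
```

This identity holds only for the linear fit. The cubic curve drawn in the plot would generally have a higher r squared than the squared correlation.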