>RE::VISION CRM

Python Data Analysis

[Python] mtcars mtcars.csv sample dataset

YONG_X 2020. 4. 9. 16:14
mtcars {datasets}    R Documentation

Motor Trend Car Road Tests

Description

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Usage

mtcars

Format

A data frame with 32 observations on 11 (numeric) variables.

[, 1]   mpg    Miles/(US) gallon
[, 2]   cyl    Number of cylinders
[, 3]   disp   Displacement (cu.in.)
[, 4]   hp     Gross horsepower
[, 5]   drat   Rear axle ratio
[, 6]   wt     Weight (1000 lbs)
[, 7]   qsec   1/4 mile time
[, 8]   vs     Engine (0 = V-shaped, 1 = straight)
[, 9]   am     Transmission (0 = automatic, 1 = manual)
[,10]   gear   Number of forward gears
[,11]   carb   Number of carburetors
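
As a rough illustration of the coding in the table above, the sketch below keeps the column descriptions in a dictionary and turns the 0/1 codes of vs and am into readable labels. The names MTCARS_COLUMNS, VS_LABELS, AM_LABELS and decode_codes are made up for this example and are not part of the R documentation. The single row is the Mazda RX4 record from mtcars, used here only as a stand-in for the full data frame.

```python
import pandas as pd

# Column descriptions copied from the Format table.
MTCARS_COLUMNS = {
    "mpg": "Miles/(US) gallon",
    "cyl": "Number of cylinders",
    "disp": "Displacement (cu.in.)",
    "hp": "Gross horsepower",
    "drat": "Rear axle ratio",
    "wt": "Weight (1000 lbs)",
    "qsec": "1/4 mile time",
    "vs": "Engine (0 = V-shaped, 1 = straight)",
    "am": "Transmission (0 = automatic, 1 = manual)",
    "gear": "Number of forward gears",
    "carb": "Number of carburetors",
}

# Hypothetical label maps for the two binary columns.
VS_LABELS = {0: "V-shaped", 1: "straight"}
AM_LABELS = {0: "automatic", 1: "manual"}


def decode_codes(df):
    """Return a copy of df with readable labels added for vs and am."""
    out = df.copy()
    out["vs_label"] = out["vs"].map(VS_LABELS)
    out["am_label"] = out["am"].map(AM_LABELS)
    return out


# One-row stand-in for the real data (the Mazda RX4 row of mtcars).
sample = pd.DataFrame(
    [[21.0, 6, 160.0, 110, 3.90, 2.620, 16.46, 0, 1, 4, 4]],
    columns=list(MTCARS_COLUMNS),
    index=["Mazda RX4"],
)
decoded = decode_codes(sample)
print(decoded[["vs", "vs_label", "am", "am_label"]])
```

Keeping the labels in separate columns leaves the numeric codes untouched, so the original columns can still be used for correlation or regression.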

Note

Henderson and Velleman (1981) comment in a footnote to Table 1: ‘Hocking [original transcriber]'s noncrucial coding of the Mazda's rotary engine as a straight six-cylinder engine and the Porsche's flat engine as a V engine, as well as the inclusion of the diesel Mercedes 240D, have been retained to enable direct comparisons to be made with previous analyses.’

Source

Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411.




file :: mtcars.csv







# Load basic libraries and define user-defined functions (UDFs)

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from numpy.polynomial.polynomial import polyfit
import matplotlib.style as style
from IPython.display import Image

# suppress library warnings to keep the notebook output clean
import warnings
warnings.filterwarnings('ignore')



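# define universally useful UDFs

# define random jitter: add Gaussian noise whose standard deviation is
# 1% of the data range, so that overlapping points separate in a scatterplot
def rjitt(arr):
    stdev = .01*(max(arr)-min(arr))
    return arr + np.random.randn(len(arr)) * stdev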
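
A small usage sketch of the jitter function may help show its effect. The function is repeated here so the snippet runs on its own. The input values are made up for illustration, and the fixed seed is only there to make the demo repeatable.

```python
import numpy as np


def rjitt(arr):
    # noise scale: 1% of the range of the input values
    stdev = .01 * (max(arr) - min(arr))
    return arr + np.random.randn(len(arr)) * stdev


np.random.seed(0)  # fixed seed, only so the demo is repeatable

# Made-up discrete values; without jitter, equal values would overlap exactly.
cyl = np.array([4, 4, 6, 6, 6, 8, 8, 8, 8])
jittered = rjitt(cyl)

print(np.round(jittered, 3))
# The noise scale here is 0.01 * (8 - 4) = 0.04, so points move only slightly.
print(np.abs(jittered - cyl).max())
```

Because the noise is scaled to the data range, the same function can be applied to both axes of a plot even when their units differ widely.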




# load mtcars.csv from a mirrored blog posting
# (index_col=0 uses the first column, the car names, as the row index)
mtcars = pd.read_csv('https://t1.daumcdn.net/cfile/blog/99F8633E5E8ECB130D?download', index_col=0)

mtcars.head()
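
To see what index_col=0 does without downloading anything, the sketch below reads two rows from an in-memory string. The CSV layout (an unnamed first column holding the car names) is an assumption about how the attached file is organized; the two rows are the Mazda RX4 and Datsun 710 records from mtcars.

```python
import io

import pandas as pd

# Assumed layout: the first, unnamed column holds the car names.
csv_text = (
    ",mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb\n"
    "Mazda RX4,21.0,6,160.0,110,3.90,2.620,16.46,0,1,4,4\n"
    "Datsun 710,22.8,4,108.0,93,3.85,2.320,18.61,1,1,4,1\n"
)

# index_col=0 turns the first column into the row index
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(cars.shape)                    # rows, columns (the names are not counted as a column)
print(cars.loc["Datsun 710", "hp"])  # rows can now be looked up by car name
```

With the car names as the index, a lookup such as cars.loc[name, "hp"] works directly, which is also what makes labeling individual points by car name straightforward.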



# create a sample scatterplot
from sklearn.metrics import r2_score

nobs = len(mtcars)
varx = mtcars.wt
vary = mtcars.hp
# red: manual transmission (am == 1); blue: automatic (am == 0)
colors1 = ['red' if x==1 else 'blue' for x in mtcars.am]
# marker size proportional to mpg
size1 = mtcars.mpg / mtcars.mpg.max() *70

plt.figure(figsize = (8,4), dpi=120)
plt.scatter(rjitt(varx), rjitt(vary),
            alpha=0.3, color=colors1, s=size1)
plt.xlabel('wt')
plt.ylabel('hp')
plt.suptitle('mpg vs. wt, hp and am', size=12)
plt.title('- size: mpg ; red: am')

# linear (degree 1) and cubic (degree 3) polynomial fits of hp on wt
xs = np.unique(varx)
plt.plot(xs, np.poly1d(np.polyfit(varx, vary, 1))(xs),
         color='grey', linewidth=2, linestyle='--')
plt.plot(xs, np.poly1d(np.polyfit(varx, vary, 3))(xs),
         color='silver', linewidth=6, alpha=0.4)

# R squared of the linear fit and the Pearson correlation
rsqrd = r2_score(vary, np.poly1d(np.polyfit(varx, vary, 1))(varx))
rsqrdtxt = '* r sqrd = ' + str(round(rsqrd,3))
crrval = np.corrcoef(varx,vary)[0,1]
crrtxt = '** correlation = ' + str(round(crrval,3))
plt.text(varx.max()*0.7, vary.max()*0.3, rsqrdtxt)
plt.text(varx.max()*0.7, vary.max()*0.25, crrtxt)

# label 10 randomly chosen cars
# (draw the sample once, without replacement, instead of redrawing per car)
for x in np.random.choice(mtcars.index, 10, replace=False):
    plt.text(varx[x]-0.2, vary[x]+4, x, size=7)
plt.show()
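
The plot annotates both the R squared of the linear fit and the correlation coefficient. For a straight-line least-squares fit with an intercept, R squared equals the square of the Pearson correlation, and the sketch below checks this numerically. It uses synthetic data with made-up coefficients rather than mtcars, and computes R squared by hand instead of calling r2_score so that it depends only on NumPy.

```python
import numpy as np

# Synthetic stand-in data (values are illustrative, not from mtcars).
rng = np.random.default_rng(42)
x = rng.uniform(1.5, 5.5, size=32)
y = 40 + 45 * x + rng.normal(0, 30, size=32)

# Degree-1 least-squares fit; np.polyfit returns the highest power first.
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# R squared = 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Pearson correlation between x and y
r = np.corrcoef(x, y)[0, 1]

print(round(r_squared, 6), round(r ** 2, 6))  # the two numbers should agree
```

This identity holds only for the degree-1 fit. For the cubic curve drawn in the plot, R squared would generally be at least as large as the squared correlation.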










