pandas

SWKo 2020. 1. 30. 20:25
20200130practice2
In [64]:
# Widen the notebook container to 90% of the page width
from IPython.display import display, HTML
display(HTML("<style> .container{width:90% !important;}</style>"))
In [40]:
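# 1. numpy: numerical and statistical computing
#    - ndarray: multidimensional array (np.array)
# 2. pandas: adds the features needed for data analysis
#    - Series: a single column
#    - DataFrame: rows and columns (a table)

# 3. matplotlib: basic visualization, modeled on MATLAB
# 4. seaborn and others: better-looking plots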
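To make the three containers in the notes above concrete, here is a minimal sketch. The values and the column names `id` and `score` are made up for illustration. Note that `ndarray` comes from NumPy, while `Series` and `DataFrame` are the pandas types built on top of it.

```python
import numpy as np
import pandas as pd

# ndarray: NumPy's n-dimensional array (illustrative values)
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Series: one labeled column of values
s = pd.Series([10, 20, 30], name="score")

# DataFrame: a table of rows and columns; each column is a Series
df = pd.DataFrame({"id": [1, 2, 3], "score": [10, 20, 30]})

print(arr.shape)          # (rows, columns) of the array
print(type(df["score"]))  # selecting one column gives a Series
print(df.shape)
```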
In [41]:
# Install and import the basic libraries
In [42]:
!pip3 install requests bs4 numpy pandas matplotlib seaborn
Requirement already satisfied: requests in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (2.22.0)
Requirement already satisfied: bs4 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (0.0.1)
Requirement already satisfied: numpy in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (1.18.1)
Requirement already satisfied: pandas in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (0.25.3)
Requirement already satisfied: matplotlib in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (3.1.2)
Requirement already satisfied: seaborn in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (0.10.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from requests) (1.25.7)
Requirement already satisfied: idna<2.9,>=2.5 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from requests) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from requests) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from requests) (2019.11.28)
Requirement already satisfied: beautifulsoup4 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from bs4) (4.8.2)
Requirement already satisfied: pytz>=2017.2 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from pandas) (2019.3)
Requirement already satisfied: python-dateutil>=2.6.1 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from matplotlib) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from matplotlib) (2.4.6)
Requirement already satisfied: scipy>=1.0.1 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from seaborn) (1.4.1)
Requirement already satisfied: soupsieve>=1.2 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from beautifulsoup4->bs4) (1.9.5)
Requirement already satisfied: six>=1.5 in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from python-dateutil>=2.6.1->pandas) (1.14.0)
Requirement already satisfied: setuptools in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (from kiwisolver>=1.0.1->matplotlib) (41.2.0)
In [43]:
!pip install xlrd
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Requirement already satisfied: xlrd in /Users/kosangwon/.conda/envs/practice2/lib/python3.7/site-packages (1.2.0)
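The warning above recommends invoking pip through the interpreter with `-m pip`. The two install cells also report different site-packages directories (a framework Python 3.8 and a conda Python 3.7), so a package can end up outside the environment the notebook kernel uses. One way to avoid that, sketched here, is to build the command around `sys.executable`. The helper name `pip_install_command` is hypothetical.

```python
import sys

def pip_install_command(*packages):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]

cmd = pip_install_command("xlrd")
print(cmd)

# To actually run it (this touches the network and your environment):
# import subprocess
# subprocess.run(cmd, check=True)
```

In a notebook, the `%pip install xlrd` magic likewise installs into the kernel's own environment.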
In [44]:
import numpy as np
In [45]:
import pandas as pd
In [46]:
import matplotlib.pyplot as plt
In [47]:
# Turn off unnecessary warning messages
import warnings
warnings.filterwarnings('ignore')
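`filterwarnings('ignore')` silences every warning for the rest of the session, including ones that point at real problems. If that is too broad, a scoped alternative is `warnings.catch_warnings()`, which restores the previous filters when the block ends. The sketch below uses a hypothetical function `noisy` purely to trigger a warning.

```python
import warnings

def noisy():
    """Hypothetical function that emits a warning, for illustration."""
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42

# Suppress warnings only inside this block; filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy()

# record=True collects warnings in a list, showing they are visible again here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()

print(result, len(caught))
```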
In [48]:
# List the available fonts
from matplotlib import font_manager
[f.name for f in font_manager.fontManager.ttflist]
Out[48]:
['DejaVu Serif Display',
 'STIXGeneral',
 'cmb10',
 'cmr10',
 'DejaVu Sans Mono',
 'STIXSizeThreeSym',
 'DejaVu Sans',
 'cmss10',
 'STIXNonUnicode',
 'STIXSizeTwoSym',
 'STIXGeneral',
 'DejaVu Sans',
 'DejaVu Sans Mono',
 'STIXSizeFiveSym',
 'STIXGeneral',
 'STIXNonUnicode',
 'STIXGeneral',
 'cmtt10',
 'STIXSizeFourSym',
 'STIXNonUnicode',
 'DejaVu Sans Mono',
 'STIXSizeFourSym',
 'DejaVu Sans',
 'cmsy10',
 'STIXSizeOneSym',
 'STIXNonUnicode',
 'STIXSizeThreeSym',
 'DejaVu Serif',
 'DejaVu Sans',
 'STIXSizeOneSym',
 'DejaVu Serif',
 'DejaVu Sans Mono',
 'DejaVu Sans Display',
 'DejaVu Serif',
 'DejaVu Serif',
 'STIXSizeTwoSym',
 'cmmi10',
 'cmex10',
 'Noto Sans Cham',
 'Papyrus',
 'Verdana',
 'Noto Sans Shavian',
 'STIXNonUnicode',
 'Zapf Dingbats',
 'Georgia',
 'Athelas',
 'STIXGeneral',
 'Apple Braille',
 'InaiMathi',
 'Noto Sans Tai Viet',
 'STIXIntegralsD',
 'Chalkboard',
 'Shree Devanagari 714',
 'Hiragino Sans GB',
 'STIXIntegralsUpSm',
 'Muna',
 'Noto Sans Lydian',
 'Comic Sans MS',
 '.SF NS Display Condensed',
 'Menlo',
 'Noto Sans Kharoshthi',
 'Comic Sans MS',
 'Telugu MN',
 'Noto Sans Hanunoo',
 'Noto Sans Glagolitic',
 'Oriya MN',
 '.SF NS Display Condensed',
 '.Arabic UI Display',
 'Courier New',
 '.SF NS Mono',
 'Trebuchet MS',
 'Gurmukhi Sangam MN',
 'Devanagari Sangam MN',
 'STIXIntegralsD',
 'Gurmukhi MN',
 'Iowan Old Style',
 'Trebuchet MS',
 'Hiragino Sans',
 'STIXNonUnicode',
 'PT Mono',
 'Noto Sans Imperial Aramaic',
 'Apple Braille',
 'Raanana',
 'Noto Sans Mandaic',
 'Noto Sans Brahmi',
 'Thonburi',
 'Times New Roman',
 'Noto Sans Tagbanwa',
 'Arial Narrow',
 '.SF Compact Text',
 'Kohinoor Bangla',
 'Snell Roundhand',
 'Euphemia UCAS',
 'Beirut',
 'Devanagari MT',
 'Helvetica',
 '.Aqua Kana',
 '.Keyboard',
 'STIXNonUnicode',
 'Times New Roman',
 'Noto Sans Kayah Li',
 'Noto Sans Linear B',
 'Noto Sans Syloti Nagri',
 'Tahoma',
 'STIXIntegralsUp',
 'Arial Black',
 'Hiragino Sans',
 'Noto Sans Tai Tham',
 'Diwan Thuluth',
 'Mishafi',
 'Avenir Next Condensed',
 'Noto Sans Ol Chiki',
 'Noto Sans Old South Arabian',
 'STIXGeneral',
 'Kohinoor Gujarati',
 'Noto Sans Old Italic',
 'Trattatello',
 'Noto Sans Samaritan',
 'Noto Nastaliq Urdu',
 'Wingdings',
 'Noto Sans Avestan',
 'STIXSizeTwoSym',
 'Verdana',
 'STIXIntegralsUpD',
 'Noto Sans Kaithi',
 'Arial',
 'Noto Sans Osmanya',
 '.SF Compact Rounded',
 'KufiStandardGK',
 'Chalkboard SE',
 'Oriya Sangam MN',
 'Times New Roman',
 'Noto Sans Tagalog',
 'Hiragino Maru Gothic Pro',
 'Verdana',
 'Luminari',
 '.SF NS Rounded',
 'Noto Sans Myanmar',
 'Noto Sans Buhid',
 'Seravek',
 'Gujarati MT',
 'Heiti TC',
 'Bodoni 72 Oldstyle',
 'Noto Sans Inscriptional Pahlavi',
 'Lao Sangam MN',
 '.SF NS Text Condensed',
 'Hiragino Sans',
 '.SF NS Display Condensed',
 'Noto Sans Runic',
 'Kohinoor Telugu',
 '.New York',
 'Arial',
 'Arial Unicode MS',
 'Hiragino Sans',
 'Bradley Hand',
 'Trebuchet MS',
 'STIXSizeThreeSym',
 'Gill Sans',
 'Noto Serif Balinese',
 'Noto Sans Cypriot',
 '.LastResort',
 'Hoefler Text',
 'Mukta Mahee',
 'Bangla MN',
 'Verdana',
 'Mshtakan',
 'Microsoft Sans Serif',
 'Sana',
 'STIXSizeFourSym',
 'Apple Chancery',
 'STIXIntegralsUpSm',
 'STIXGeneral',
 'STIXNonUnicode',
 'AppleGothic',
 'Gujarati Sangam MN',
 'Futura',
 'Myanmar MN',
 'Bodoni 72',
 'Al Tarikh',
 'PT Serif',
 'Hiragino Sans',
 '.SF NS Text Condensed',
 'Noto Sans Bamum',
 'Kokonor',
 'Times New Roman',
 'Noto Sans Inscriptional Parthian',
 '.Arabic UI Text',
 'Baghdad',
 'Tamil MN',
 'Avenir',
 'AppleMyungjo',
 'Ayuthaya',
 'Wingdings 2',
 'SignPainter',
 'Georgia',
 'System Font',
 'Diwan Kufi',
 'PingFang HK',
 'System Font',
 'Noto Sans Egyptian Hieroglyphs',
 'Phosphate',
 'Silom',
 'Kohinoor Devanagari',
 'Baskerville',
 'Hiragino Sans',
 'Courier New',
 'STIXIntegralsUpD',
 'Arial Hebrew',
 'Impact',
 'Noto Sans PhagsPa',
 'Noto Sans Chakma',
 '.SF NS Text Condensed',
 'Bangla Sangam MN',
 'American Typewriter',
 'Arial',
 'DecoType Naskh',
 'Wingdings 3',
 '.SF NS Display Condensed',
 'Skia',
 'Hiragino Sans',
 'Noto Sans Ogham',
 'Noto Sans New Tai Lue',
 'Noto Sans Meetei Mayek',
 'STIXGeneral',
 '.SF Compact Text',
 'Brush Script MT',
 'Plantagenet Cherokee',
 'Noto Sans Lisu',
 'Didot',
 'Apple Braille',
 'Apple SD Gothic Neo',
 'Gurmukhi MT',
 'Al Nile',
 'Damascus',
 'Helvetica Neue',
 'Telugu Sangam MN',
 'Apple Braille',
 'Noto Sans Oriya',
 'Noto Sans Tai Le',
 'Heiti TC',
 '.SF NS Text Condensed',
 'STIXIntegralsUp',
 'STIXSizeOneSym',
 'Noto Sans Thaana',
 'Kefa',
 'STIXVariants',
 'Lucida Grande',
 'Noto Sans Phoenician',
 '.SF NS Display Condensed',
 'Hiragino Sans',
 'Arial',
 'Marion',
 'Palatino',
 'Big Caslon',
 'STIXIntegralsSm',
 'Noto Sans Lepcha',
 'Mishafi Gold',
 '.SF NS Text Condensed',
 'Sinhala MN',
 'Charter',
 'Chalkduster',
 'STIXIntegralsSm',
 'Hoefler Text',
 'Apple Symbols',
 'Noto Sans Mongolian',
 'Arial Narrow',
 'Georgia',
 'Sathu',
 'Noto Sans Cuneiform',
 'DIN Condensed',
 'Nadeem',
 'Times',
 'Andale Mono',
 'PT Sans',
 'Kannada Sangam MN',
 'Zapfino',
 'Noto Sans Old Turkic',
 'Noto Sans Gothic',
 'Optima',
 'Sukhumvit Set',
 'Rockwell',
 'Noto Serif Myanmar',
 'Georgia',
 'DIN Alternate',
 'Arial Rounded MT Bold',
 'Noto Sans Yi',
 'Noto Sans Syriac',
 'Tahoma',
 'Savoye LET',
 '.Helvetica Neue DeskInterface',
 'Tamil Sangam MN',
 'STIXSizeFiveSym',
 'Bodoni 72 Smallcaps',
 'Courier New',
 'PT Serif Caption',
 '.New York',
 '.SF NS Text Condensed',
 'Marker Felt',
 'Noto Sans Javanese',
 'Noto Sans Old Persian',
 'Noto Sans NKo',
 'Noto Sans Coptic',
 'STIXSizeOneSym',
 'Noto Sans Batak',
 '.SF NS Display Condensed',
 'Krungthep',
 '.SF NS Display Condensed',
 'Malayalam Sangam MN',
 'Khmer Sangam MN',
 'Farisi',
 'Hiragino Sans',
 'Superclarendon',
 'Noto Sans Armenian',
 'Malayalam MN',
 'Kailasa',
 'Bodoni Ornaments',
 'Arial Narrow',
 'Hiragino Mincho ProN',
 'New Peninim MT',
 'Apple Braille',
 'Webdings',
 'Farah',
 'STIXSizeFourSym',
 'Al Bayan',
 'Geeza Pro',
 'Noto Sans Buginese',
 'Herculanum',
 'Courier New',
 'ITF Devanagari',
 'Songti SC',
 'Noto Sans Carian',
 'Avenir Next',
 'Lao MN',
 'Kannada MN',
 'Khmer MN',
 'Noto Sans Saurashtra',
 'Waseem',
 'Noteworthy',
 '.SF NS Display Condensed',
 'Noto Sans Ugaritic',
 'Noto Sans Rejang',
 'Noto Sans Lycian',
 'Noto Sans Limbu',
 'Copperplate',
 'Cochin',
 'Corsiva Hebrew',
 '.SF Compact Display',
 'Myanmar Sangam MN',
 'STIXSizeTwoSym',
 'Noto Sans Sundanese',
 'Noto Sans Kannada',
 'Hiragino Sans',
 'STIXSizeThreeSym',
 'Noto Sans Vai',
 'Noto Sans Tifinagh',
 'Symbol',
 '.SF NS Display Condensed',
 'Arial Narrow',
 'Galvji',
 'STIXVariants',
 'Sinhala Sangam MN',
 '.SF NS Mono',
 'Trebuchet MS']
In [49]:
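# Change the font so that Korean text renders in plots.
# 'Gulim' is not in the font list above (it ships with Windows); 'AppleGothic' is.
plt.rc('font', family='AppleGothic')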
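The font name has to match one that matplotlib actually found. The list printed above includes 'AppleGothic' and 'Apple SD Gothic Neo' but not 'Gulim'. A minimal sketch of choosing the first installed font from a list of candidates follows. The helper names `pick_font` and `apply_korean_font` and the candidate list are assumptions for illustration, not part of the original notebook.

```python
def pick_font(candidates, available):
    """Return the first candidate present in `available`, or None."""
    available = set(available)
    for name in candidates:
        if name in available:
            return name
    return None

def apply_korean_font(candidates=("AppleGothic", "Apple SD Gothic Neo",
                                  "Malgun Gothic", "Gulim", "NanumGothic")):
    """Set the first installed candidate as the matplotlib font."""
    import matplotlib.pyplot as plt
    from matplotlib import font_manager

    installed = [f.name for f in font_manager.fontManager.ttflist]
    chosen = pick_font(candidates, installed)
    if chosen is not None:
        plt.rc("font", family=chosen)
        # Assumption: use an ASCII hyphen for minus signs, since some
        # Korean fonts lack the Unicode minus glyph.
        plt.rc("axes", unicode_minus=False)
    return chosen

# A few names taken from the font list printed above
sample = ["DejaVu Sans", "AppleGothic", "Apple SD Gothic Neo", "Arial"]
print(pick_font(["Gulim", "AppleGothic"], sample))
```

Calling `apply_korean_font()` once after importing matplotlib would then apply whichever candidate exists on the current machine.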
In [56]:
# Always store loaded data in a variable
data = pd.read_excel('navernews.xlsx')
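`pd.read_excel` returns a new DataFrame, so the result is lost unless it is assigned. A small sketch of wrapping the load so a missing file fails with a clear message is shown below. The helper name `load_news` is hypothetical, and the file `navernews.xlsx` is assumed to sit in the working directory as in the cell above.

```python
from pathlib import Path
import pandas as pd

def load_news(path):
    """Read an Excel sheet into a DataFrame, failing early if the file is missing."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Excel file not found: {path}")
    # Reading .xlsx needs an Excel engine to be installed
    # (openpyxl with recent pandas releases).
    return pd.read_excel(path)

try:
    data = load_news("navernews.xlsx")
except FileNotFoundError as err:
    data = None
    print(err)
```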
In [57]:
# Inspect the data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 5 columns):
번호     10 non-null int64
제목     10 non-null object
URL    10 non-null object
내용     10 non-null object
언론사    10 non-null object
dtypes: int64(1), object(4)
memory usage: 528.0+ bytes
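`info()` only prints its summary. When the same facts are needed as values (for example, to check for missing data in code), `dtypes` and `notna().sum()` give them directly. The frame below is a small illustrative stand-in that reuses two column names from the output above; the `None` is added here to show a missing value.

```python
import pandas as pd

# Small illustrative frame; the None is added to show a missing value
df = pd.DataFrame({
    "번호": [1, 2, 3],
    "언론사": ["연합뉴스", "스포츠경향", None],
})

non_null = df.notna().sum()   # per-column count of non-missing values
print(non_null)
print(df.dtypes)              # per-column dtype
print(len(df), "rows")
```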
In [58]:
# Quick summary statistics; most useful for numeric columns (by default only those are reported)
data.describe()
Out[58]:
번호
count 10.00000
mean 5.50000
std 3.02765
min 1.00000
25% 3.25000
50% 5.50000
75% 7.75000
max 10.00000
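The numbers above can be reproduced by hand. The 번호 column appears to hold 1 through 10 (consistent with the count, min, max and mean shown), so the sketch below rebuilds it as a Series under that assumption. It also shows that the `std` row is the sample standard deviation (`ddof=1`), not the population one.

```python
import pandas as pd

# Assumption: the 번호 column holds the integers 1..10
num = pd.Series(range(1, 11), name="번호")

stats = num.describe()
print(stats)

# describe() reports the sample standard deviation (ddof=1)
sample_std = num.std(ddof=1)
population_std = num.std(ddof=0)
print(round(sample_std, 5), round(population_std, 5))
```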
In [62]:
data.head()  # first 5 rows
# data.tail()  # last 5 rows
Out[62]:
  | 번호 | 제목 | URL | 내용 | 언론사
0 | 1 | 맨시티, 맨유 제치고 3시즌 연속 카라바오컵 결승행 | http://yna.kr/AKR20200130024000007?did=1195m | 준결승 2차전에서 맨시티, 맨유에 0-1 패배…1, 2차전 합계 3-2로 결승행 맨... | 연합뉴스
1 | 2 | [카라바오컵] 맨유, 맨시티 1-0 격파…결승행은 좌절 | http://sports.khan.co.kr/news/sk_index.html?ar... | 세르히오 아구에로에게 골을 내줬지만 오프사이드가 선언돼 다시 한숨 돌렸다. 경기 종... | 스포츠경향
2 | 3 | [스경X라인업] 맨시티-맨유 명단 공개…최정예 총력전 | http://sports.khan.co.kr/news/sk_index.html?ar... | 맨시티와 맨유는 30일 오전 4시 45분(한국시간) 영국 맨체스터의 에티하드 스타디... | 스포츠경향
3 | 4 | 맨유 잡은 맨시티, 역대 두 번째 리그컵 3연패? | http://www.dailian.co.kr/news/view/864805?sc=N... | 하지만 지난 맨유 원정서 3-1 승리했던 맨시티는 1~2차전 합계 3-2로 지역 라... | 데일리안
4 | 5 | 맨유 솔샤르 감독 “6주 사이에 맨시티 원정 2승, 선수들 자랑스럽다” | http://sports.donga.com/3/all/20200130/99457429/2 | 잉글랜드 프리미어리그(EPL) 맨체스터 유나이티드(이하 맨유) 올레 군나르 솔샤르 ... | 스포츠동아
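Both `head` and `tail` default to five rows and accept a row count. A minimal sketch on an illustrative ten-row frame (not the news data itself):

```python
import pandas as pd

# Illustrative frame with ten numbered rows
df = pd.DataFrame({"번호": range(1, 11)})

first_three = df.head(3)   # first 3 rows
last_two = df.tail(2)      # last 2 rows
print(first_three["번호"].tolist())
print(last_two["번호"].tolist())
```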
In [63]:
data.take([1, 3, 5])  # select only the rows at the given positions
Out[63]:
  | 번호 | 제목 | URL | 내용 | 언론사
1 | 2 | [카라바오컵] 맨유, 맨시티 1-0 격파…결승행은 좌절 | http://sports.khan.co.kr/news/sk_index.html?ar... | 세르히오 아구에로에게 골을 내줬지만 오프사이드가 선언돼 다시 한숨 돌렸다. 경기 종... | 스포츠경향
3 | 4 | 맨유 잡은 맨시티, 역대 두 번째 리그컵 3연패? | http://www.dailian.co.kr/news/view/864805?sc=N... | 하지만 지난 맨유 원정서 3-1 승리했던 맨시티는 1~2차전 합계 3-2로 지역 라... | 데일리안
5 | 6 | 맨시티팬들, 맨유팬 향해 '뮌헨참사' 조롱 논란(英BBC) | http://moneys.mt.co.kr/news/mwView.php?no=2020... | 이를 라이벌인 맨시티 팬들이 조롱하자 일부 맨유 팬들은 참지 못하고 경기가 끝난 뒤... | 머니S
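`take` selects by position, not by index label. With the default `RangeIndex` the two coincide, which is why rows 1, 3 and 5 above also carry those labels. The sketch below uses an illustrative frame with string labels to make the difference visible, and compares `take` with the `iloc` and `loc` indexers.

```python
import pandas as pd

# Illustrative frame with a non-default index so position and label differ
df = pd.DataFrame(
    {"번호": [1, 2, 3, 4, 5, 6]},
    index=["a", "b", "c", "d", "e", "f"],
)

by_take = df.take([1, 3, 5])      # positions 1, 3, 5
by_iloc = df.iloc[[1, 3, 5]]      # same rows, positional indexer
by_loc = df.loc[["b", "d", "f"]]  # same rows, selected by index label

print(by_take["번호"].tolist())
print(by_take.equals(by_iloc), by_take.equals(by_loc))
```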
