Posts in the 'All categories' category (page 14)


All categories 256

Adding titles to subplots with matplotlib or seaborn

Use the set_title() method, as shown below.
    import numpy as np
    import matplotlib.pyplot as plt
    x = np.linspace(-1, 1, 100)
    sin = np.sin(x)
    cos = np.cos(x)
    fig, ax = plt.subplots(1, 2)
    ax[0].plot(x, sin)
    ax[1].plot(x, cos)
    ax[0].set_title("Title 1")
    ax[1].set_title("Title 2")
    plt.show()
Reference: How to add titles to subplots in Matplotlib. You can add a title to a subplot in matplotlib with the set_title() and title.set_text() methods. www...
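
The following is a minimal sketch, not taken from the original post, of how the axes indexing depends on the grid shape. plt.subplots(1, 2) returns a one-dimensional array of axes, so a single index is enough, while a 2 x 2 grid returns a two-dimensional array indexed as [row, column]. The "Agg" backend and the 2 x 2 titles are assumptions added so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1, 100)

# 1 x 2 grid: `axes` is a 1-D array, so one index selects a subplot.
fig, axes = plt.subplots(1, 2)
axes[0].plot(x, np.sin(x))
axes[1].plot(x, np.cos(x))
axes[0].set_title("Title 1")
axes[1].title.set_text("Title 2")  # same effect, via the title Text object

# 2 x 2 grid: `grid` is a 2-D array indexed as [row, column].
fig2, grid = plt.subplots(2, 2)
grid[0, 0].set_title("Top left")
grid[1, 1].set_title("Bottom right")
```

In an interactive session, plt.show() would follow as in the post.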

Python 2021.03.16

"index 2 is out of bounds for axis 0 with size" error when viewing a DataFrame after toPandas()

Sometimes you build a DataFrame with Spark or Koalas and then need to convert it to a pandas DataFrame with toPandas(). In my case, drawing a heatmap directly from a Koalas DataFrame raised an error, so I converted it to a pandas DataFrame. The problem was that displaying the converted DataFrame raised an error like "index 2 is out of bounds for axis 0 with size". Specifically, the error occurred when the DataFrame contained NaN values and they had been replaced with 0 via df.fillna(0). In that case, add the code below just above the toPandas() call. Setting the parameter to -1 reproduces the same error. pd.set_option('displa..

BigData/Spark 2021.03.16

"Cannot combine the series or dataframe because it comes from a different dataframe" error in Koalas

When working with DataFrames through Koalas in PySpark, you may run into the following error. Cannot combine the series or dataframe because it comes from a different dataframe. In order to allow this operation, enable 'compute.ops_on_diff_frames' option. In this case, set the option named in the error message, as shown below.
    from databricks.koalas.config import set_option, reset_option
    set_option("compute.ops_on_diff_frames", True)
    kdf['C'] = kser
    # Reset ..

BigData/Spark 2021.03.15

"Container killed by YARN for exceeding physical memory limits" error

The following error can occur when running spark-submit. It means the executor's container used more memory than it was allocated, so the fix is to give the executor more memory.
    cluster.YarnScheduler: Lost executor 20 on xxx: Container killed by YARN for exceeding physical memory limits. 6 GB of 6 GB physical memory used. Consider boosting spark.executor.memoryOverhead.
Raise the value of the option below when calling spark-submit.
    spark-submit --master yarn \
    .....
    --executor-memory 12g
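
To see why raising --executor-memory also raises the limit in the error, here is a rough arithmetic sketch. It assumes Spark's documented defaults: when spark.executor.memoryOverhead is not set, the overhead is 10% of the executor memory with a floor of 384 MiB, and the per-executor container request is the sum of the two. The 5g starting value is hypothetical, since the post does not state the original setting, and the helper name is made up for illustration.

```python
MIN_OVERHEAD_MIB = 384   # documented minimum for spark.executor.memoryOverhead
OVERHEAD_PERCENT = 10    # documented default overhead factor (0.10)

def container_request_mib(executor_memory_mib, overhead_mib=None):
    """Return (overhead, total) in MiB requested per executor container."""
    if overhead_mib is None:
        # Default rule: 10% of executor memory, but never below 384 MiB.
        overhead_mib = max(MIN_OVERHEAD_MIB,
                           executor_memory_mib * OVERHEAD_PERCENT // 100)
    return overhead_mib, executor_memory_mib + overhead_mib

# 5g is a hypothetical "before" value; 12g is the value used in the post.
for label, mem in [("5g", 5 * 1024), ("12g", 12 * 1024)]:
    overhead, total = container_request_mib(mem)
    print(f"--executor-memory {label}: overhead {overhead} MiB, "
          f"container {total} MiB")
```

The error message itself suggests the other lever: setting spark.executor.memoryOverhead explicitly with --conf, which corresponds to passing overhead_mib in the sketch.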

BigData/Spark 2021.03.15

Getting a query parameter value from a URL

The code below retrieves the value of the parameter you want.
    import urllib.parse as urlparse
    from urllib.parse import parse_qs
    url = 'http://abc.com/aaa?prd_no=000'
    parsed = urlparse.urlparse(url)
    print(parse_qs(parsed.query)['prd_no'][0])
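
Indexing the parse_qs() result directly raises KeyError when the parameter is missing. The sketch below wraps the same standard-library calls in a small helper that returns a default instead. The helper name get_query_param and the extra color parameters in the URL are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

def get_query_param(url, name, default=None):
    """Return the first value of query parameter `name`, or `default` if absent."""
    query = parse_qs(urlparse(url).query)  # maps each name to a list of values
    values = query.get(name)
    return values[0] if values else default

# Example URL; the color parameters are hypothetical additions.
url = "http://abc.com/aaa?prd_no=000&color=red&color=blue"

print(get_query_param(url, "prd_no"))                  # 000
print(get_query_param(url, "color"))                   # red (first of two values)
print(get_query_param(url, "missing", default="n/a"))  # n/a
```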

Python 2021.03.12

When the full text of a pandas DataFrame column is not displayed in Python

Long column values are sometimes shortened with "...". In that case, run the option below before displaying the data. The second argument, -1, means there is no limit on the number of characters shown; since pandas 1.0, -1 is deprecated for this option and None is used instead. pd.set_option('display.max_colwidth', -1) Reference: stackoverflow.com/questions/25351968/how-to-display-full-non-truncated-dataframe-information-in-html-when-convertin How to display full (non-truncated) dataframe information in html when converting from pandas dataframe to html? I converted a pa..

Python/Pandas 2021.03.12

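Showing every column of a DataFrame in Python

When you display a pandas DataFrame that has many columns, not all of them are shown. Running the option below before displaying the data shows every column. Note, however, that an error occurs when there are too many columns. The second argument is the number of columns to show, and -1 means no limit (the pandas documentation lists None as the value for no limit). pd.set_option('display.max_columns', -1) Reference: towardsdatascience.com/how-to-show-all-columns-rows-of-a-pandas-dataframe-c49d4507fcf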
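
As a hedged alternative to changing the display options globally, the sketch below lifts the column-count and column-width limits only for one block by using pd.option_context, and passes None as the "no limit" value. The DataFrame is made up for illustration, and the sketch assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical frame for illustration: 30 numeric columns plus one long text column.
df = pd.DataFrame({f"col_{i}": [i] for i in range(30)})
df["note"] = ["x" * 200]

# None means "no limit" for both options; option_context restores the
# previous settings automatically when the with-block ends.
with pd.option_context("display.max_columns", None,
                       "display.max_colwidth", None):
    full_view = repr(df)

print(full_view)
```

Because the options are restored afterwards, other output in the same session keeps its usual truncation.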

Python/Pandas 2021.03.12

Drawing histogram charts with Koalas or pandas

To draw histograms inside a loop, do the following.
    from databricks import koalas as ks
    import matplotlib.pyplot as plt
    from IPython.display import display
    for i in range(1,7):
        print("cust_grp_no :{i}".format(i=i))
        sql =""" select count(1) grp_{grp_no}_cnt from table_name a where 1=1 and part_date >= '20210301' and part_date

Python 2021.03.09

When a task box in Tree View has no bold border

As shown below, a green task box sometimes has no bold border. This happens when the run was started by pressing the Trigger DAG button below. Do not mistake it for a scheduled batch run.

BigData/Airflow 2021.03.09

Checking the current time and changing its format

The code below gets the current time and adds or subtracts time from it.
    from datetime import datetime, timedelta
    datetime.now()
    (datetime.now()+timedelta(minutes=1)).strftime("%Y%m%d%H%M")+"00"
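
The same expression can be wrapped in a function so the shift is a parameter and the result can be checked against a fixed time. This is a small sketch; the function name shifted_stamp and the fixed example time are assumptions for illustration.

```python
from datetime import datetime, timedelta

def shifted_stamp(base, minutes=1):
    """Shift `base` by `minutes`, format as YYYYMMDDHHMM, and append "00" as seconds."""
    return (base + timedelta(minutes=minutes)).strftime("%Y%m%d%H%M") + "00"

fixed = datetime(2021, 3, 8, 23, 59, 30)  # fixed example time, so results are reproducible

print(shifted_stamp(fixed))               # 20210309000000 (rolls over to the next day)
print(shifted_stamp(fixed, minutes=-60))  # 20210308225900 (negative values subtract)
print(shifted_stamp(datetime.now()))      # depends on the current time
```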

Python 2021.03.08


프로도의 블로그 (Frodo's Blog)
