
Get Yahoo Finance Data in Python

Hey everyone, welcome back to Intrendias! To perform stock market analysis, derive technical insights, or develop trading strategies, you need access to reliable data. Today, let's explore how to retrieve stock or cryptocurrency data from Yahoo Finance, including open, close, high, low, and volume.


There are many ways to extract data from Yahoo Finance in Python. We can write a custom Python function or use an open-source library; both methods rely on the Yahoo Finance API.


Get Yahoo Finance Data in Python with a Custom Function


import time
import pandas as pd
from datetime import timedelta, datetime

# Here's the core function that fetches historical data for a given stock, crypto, or ETF.
# We'll be using the Yahoo Finance API to get the data. The function takes the ticker symbol,
# start date, and end date as input parameters.

def get_data(ticker, sdDate, edDate):    
    # CONVERT STRING DATE TO UNIX TIMESTAMP -----------------------------------
    # Without an adjustment the returned data stops one day short of the requested end date,
    # so add one day to the user-supplied end date edDate.
    formatDate = "%Y-%m-%d"
    edDate = datetime.strptime(edDate, formatDate) #convert string date into data type date
    edDate = edDate + timedelta(days=1) #add one day to date type 
    edDate = edDate.strftime('%Y-%m-%d') #convert back to string type date
    
    # To work with the Yahoo Finance API, we need to convert our date strings to Unix timestamps.
    # This ensures compatibility with the API's request format.
    # Note: time.mktime interprets the parsed date in the machine's local time zone.

    # Convert to Unix timestamps
    period1 = int(time.mktime(time.strptime(sdDate,formatDate)))
    period2 = int(time.mktime(time.strptime(edDate,formatDate)))
    
    # We also handle ticker format issues, making sure the symbol is compatible with the API.
    # Then we build the query URL for fetching historical data.
    # Example: BRK.B must be written in the BRK-B format
    ticker = ticker.replace(".","-")
    querystring = f'https://query1.finance.yahoo.com/v7/finance/download/{ticker}?period1={period1}&period2={period2}&interval=1d&events=history&includeAdjustedClose=true'
    
    # Download the CSV response and index the rows by date
    daily = pd.read_csv(querystring)
    daily = daily.set_index(pd.DatetimeIndex(daily['Date'].tolist()))
    return daily 

# To put it all into action, here's an example: fetching historical data for BRK-B.
ticker = 'BRK-B'
df = get_data(ticker,'1996-05-09','2024-01-23')
df

To begin, let's look at the functions that help us convert a date string into a Unix timestamp:


period1 = int(time.mktime(time.strptime(sdDate,formatDate)))
period2 = int(time.mktime(time.strptime(edDate,formatDate)))

  1. time.strptime(sdDate, formatDate): This function parses a string representing a time according to a specified format. Here, it converts the start date (sdDate) from its string representation to a struct_time object using the given format (formatDate).

  2. time.mktime(...): Another function from the time module, mktime(...) takes a struct_time object, interprets it as local time, and returns the corresponding Unix timestamp. A Unix timestamp is the number of seconds that have elapsed since the epoch (January 1, 1970, 00:00:00 UTC).

  3. int(...): Lastly, the result of time.mktime(...) is cast to an integer with int(). This is needed because time.mktime returns a floating-point number, while the period1 and period2 values in the query URL are written as whole seconds.

In summary, these two lines convert the start date (sdDate) and end date (edDate) from human-readable strings into Unix timestamps (period1 and period2), which is the numerical form the API request expects.
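One thing to keep in mind: because time.mktime works in local time, the same date string can produce different timestamps on machines in different time zones. Below is a minimal sketch of a time-zone-independent variant. The helper name to_unix_utc is hypothetical (it is not part of the function above), and it assumes you want midnight UTC for each date.

```python
from datetime import datetime, timezone

def to_unix_utc(date_str, fmt="%Y-%m-%d"):
    """Hypothetical helper: midnight UTC of date_str as an integer Unix timestamp."""
    parsed = datetime.strptime(date_str, fmt)           # naive datetime at 00:00
    parsed_utc = parsed.replace(tzinfo=timezone.utc)    # declare the value to be UTC
    return int(parsed_utc.timestamp())                  # float seconds -> whole seconds

print(to_unix_utc("1970-01-02"))  # 86400, exactly one day after the epoch
print(to_unix_utc("2024-01-23"))  # 1705968000
```

Whether you prefer local time or UTC depends on your use case; the point is only that the choice is explicit here instead of depending on the machine running the script.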


Using an f-string here lets us build the querystring value dynamically. That URL calls the Yahoo Finance API and requests a CSV file holding the historical pricing data of the stock, crypto, forex pair, or ETF.
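As an alternative to writing the whole URL by hand, here is a small sketch that assembles the same query string from a dictionary with urllib.parse.urlencode. The helper name build_query_url and the BASE_URL constant are made up for illustration; the endpoint and parameter names are the ones used in the function above, and the timestamps in the example call are placeholders.

```python
from urllib.parse import urlencode

# Endpoint taken from the get_data function; BASE_URL is just a name chosen for this sketch.
BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download"

def build_query_url(ticker, period1, period2, interval="1d"):
    """Hypothetical helper: build the download URL for a ticker and timestamp range."""
    ticker = ticker.replace(".", "-")  # BRK.B must be written as BRK-B
    params = {
        "period1": period1,
        "period2": period2,
        "interval": interval,
        "events": "history",
        "includeAdjustedClose": "true",
    }
    # urlencode joins the pairs with "&" and escapes any special characters
    return f"{BASE_URL}/{ticker}?{urlencode(params)}"

# Placeholder timestamps, only to show the shape of the resulting URL
print(build_query_url("BRK.B", 0, 86400))
```

Keeping the parameters in a dictionary makes it easier to change one of them (for example the interval) without editing a long string.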


Wait... Why Do We Add a Day?

    formatDate = "%Y-%m-%d"
    edDate = datetime.strptime(edDate, formatDate) #convert string date into data type date
    edDate = edDate + timedelta(days=1) #add one day to date type 
    edDate = edDate.strftime('%Y-%m-%d') #convert back to string type date

Here we convert the string into a datetime just so we can add one day with timedelta(). Without this step, the returned data stops one day short of the requested end date: if I request 2024-01-09, the final date returned is 2024-01-08. Adding a day makes the end date inclusive, so the user gets every date they asked for.
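To see the add-a-day step in isolation, here is a minimal sketch that wraps those same three operations in a small function. The name inclusive_end is hypothetical and only used for this illustration.

```python
from datetime import datetime, timedelta

def inclusive_end(edDate, formatDate="%Y-%m-%d"):
    """Hypothetical helper: return the day after edDate, in the same string format."""
    next_day = datetime.strptime(edDate, formatDate) + timedelta(days=1)
    return next_day.strftime(formatDate)

print(inclusive_end("2024-01-09"))  # 2024-01-10
print(inclusive_end("2024-02-28"))  # 2024-02-29 (2024 is a leap year)
print(inclusive_end("2023-12-31"))  # 2024-01-01
```

Going through datetime and timedelta instead of editing the string by hand means month ends, year ends, and leap years are handled for us.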





The second option is the open-source yfinance library. yfinance is not affiliated with, endorsed by, or vetted by Yahoo, Inc. It's an open-source tool that uses Yahoo's publicly available APIs and is intended for research and educational purposes.


import yfinance as yf
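# Pass the dates and interval by keyword so they cannot land on the wrong positional parameter
df = yf.download('BRK-B', start='1996-05-09', end='2024-01-23', interval='1d')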
df
