Compare commits
32 Commits
`403607c721`, `604f74ea5e`, `95b59a7d53`, `1ddc38c04f`, `fcab4ea37e`, `d40ad6282a`, `c95c02fdfc`, `5fb9cab817`, `1c39ca4762`, `27868a4677`, `0da75493dc`, `14c6d05854`, `8e369a2736`, `e09f77eb06`, `7db798e900`, `3b7758b3ab`, `b729bcb1df`, `6d7c5b6f4c`, `45568fd765`, `fc020d953a`, `bc645bb7dd`, `74198aeed4`, `890922f68b`, `44ef4a73ac`, `c745df183a`, `0e069dbbff`, `25144351a4`, `3dd7e40df2`, `5bbcedfe2e`, `fc049e1e0d`, `00a44a8132`, `0be2ce1fd5`
`.github/workflows/pytest.yml.disabled` — new file, 35 lines (`@@ -0,0 +1,35 @@`):

```yaml
name: Pytest

on:
  pull_request:
    branches:
      - master
      - main
      - dev

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.12"
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt pytest

      - name: Run non-cache tests
        run: pytest tests/ --ignore tests/test_cache.py --ignore tests/test_price_repair.py

      - name: Run cache tests
        run: |
          pytest tests/test_cache.py::TestCache
          pytest tests/test_cache.py::TestCacheNoPermission
```
`CHANGELOG.rst` (`@@ -1,6 +1,27 @@`):

```diff
 Change Log
 ===========

+0.2.54
+------
+Hotfix user-agent #2277
+
+0.2.53
+------
+Fixes:
+- Fix: Failed to parse holders JSON data #2234
+- Fix: Bad data in Holders #2244
+- Stop CSRF-cookie-fetch fail killing yfinance #2249
+- Fix Market Docs #2250
+- Fix: Broken "See also" links in documentation #2253
+- Fix: Interval check and error message formatting in multi.py #2256
+Improve:
+- Add pre- / post-stock prices (and other useful information) #2212
+- Warn user when use download() without specifying auto_adjust #2230
+- Refactor: Earnings Dates – Switch to API Fetching #2247
+- Improve prices div repair #2260
+Maintenance:
+- Add GitHub Actions workflow and fix failing tests #2233
+
 0.2.52
 ------
 Features:
```
Sphinx `conf.py` (`@@ -29,7 +29,8 @@ exclude_patterns = []`):

```diff
 autoclass_content = 'both'
 autosummary_generate = True
 autodoc_default_options = {
-    'exclude-members': '__init__'
+    'exclude-members': '__init__',
+    'members': True,
 }

 # -- Options for HTML output -------------------------------------------------
```
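The `conf.py` change above turns on member documentation by default while still excluding `__init__`. A minimal sketch of the resulting options mapping (values taken from the hunk; the surrounding Sphinx project settings are assumed):

```python
# Sphinx autodoc defaults after the fix: include documented members by
# default, keep __init__ excluded (autoclass_content = 'both' already
# merges the __init__ docstring into the class docstring).
autodoc_default_options = {
    'exclude-members': '__init__',
    'members': True,
}
print(sorted(autodoc_default_options))  # → ['exclude-members', 'members']
```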
docs API index (reStructuredText, `@@ -15,14 +15,15 @@ The following are the publicly available classes, and functions exposed by the` and `@@ -33,9 +34,10 @@`; the scrape lost the gutter markers, so only the clearest renames are marked):

```diff
 - :attr:`Ticker <yfinance.Ticker>`: Class for accessing single ticker data.
 - :attr:`Tickers <yfinance.Tickers>`: Class for handling multiple tickers.
-- :attr:`MarketSummary <yfinance.MarketSummary>`: Class for accessing market summary.
+- :attr:`Market <yfinance.Market>`: Class for accessing market summary.
-- :attr:`download <yfinance.download>`: Function to download market data for multiple tickers.
 - :attr:`Search <yfinance.Search>`: Class for accessing search results.
 - :attr:`Sector <yfinance.Sector>`: Domain class for accessing sector information.
 - :attr:`Industry <yfinance.Industry>`: Domain class for accessing industry information.
+- :attr:`download <yfinance.download>`: Function to download market data for multiple tickers.
 - :attr:`EquityOperation <yfinance.EquityOperation>`: Class to build equity market operation.
 - :attr:`Query <yfinance.Query>`: Class to build query.
 - :attr:`Screener <yfinance.Screener>`: Class to screen the market using defined query.
-- :attr:`Market <yfinance.Market>`: Class for accessing market status & summary.
 - :attr:`EquityQuery <yfinance.EquityQuery>`: Class to build equity query filters.
 - :attr:`FundQuery <yfinance.FundQuery>`: Class to build fund query filters.
 - :attr:`screen <yfinance.screen>`: Run equity/fund queries.
 - :attr:`enable_debug_mode <yfinance.enable_debug_mode>`: Function to enable debug mode for logging.
 - :attr:`set_tz_cache_location <yfinance.set_tz_cache_location>`: Function to set the timezone cache location.
```

```diff
    yfinance.ticker_tickers
    yfinance.stock
+   yfinance.market
    yfinance.financials
    yfinance.analysis
-   yfinance.marketsummary
-   yfinance.market
    yfinance.search
    yfinance.sector_industry
    yfinance.screener
```
docs Market page (reStructuredText, `@@ -1,16 +1,41 @@`) — renamed from "Market Summary" and expanded:

```diff
 =====================
-Market Summary
+Market
 =====================

 .. currentmodule:: yfinance


+Class
+------------
+The `Market` class, allows you to access market data in a Pythonic way.
+
+.. autosummary::
+   :toctree: api/
+
+   Market
+
+Market Sample Code
+------------------
-The `Market` class, allows you to access market summary data in a Pythonic way.
-------------------

 .. literalinclude:: examples/market.py
    :language: python

+
+Markets
+------------
+There are 8 different markets available in Yahoo Finance.
+
+* US
+* GB
+
+\
+
+* ASIA
+* EUROPE
+
+\
+
+* RATES
+* COMMODITIES
+* CURRENCIES
+* CRYPTOCURRENCIES
```
conda recipe `meta.yaml` (`@@ -1,5 +1,5 @@`):

```diff
 {% set name = "yfinance" %}
-{% set version = "0.2.52" %}
+{% set version = "0.2.54" %}

 package:
   name: "{{ name|lower }}"
```
`requirements.txt` (`@@ -2,12 +2,10 @@ pandas>=1.3.0`) — the caching and rate-limiting packages move out of the core requirements:

```diff
 numpy>=1.16.5
 requests>=2.31
 multitasking>=0.0.7
 lxml>=4.9.1
 platformdirs>=2.0.0
 pytz>=2022.5
 frozendict>=2.3.4
 beautifulsoup4>=4.11.1
 html5lib>=1.1
 peewee>=3.16.2
-requests_cache>=1.0
-requests_ratelimiter>=0.3.1
```
`setup.py` — 4 lines changed (`@@ -61,9 +61,9 @@ setup(`); `lxml` and `html5lib` leave `install_requires`:

```diff
 packages=find_packages(exclude=['contrib', 'docs', 'tests', 'examples']),
 install_requires=['pandas>=1.3.0', 'numpy>=1.16.5',
                   'requests>=2.31', 'multitasking>=0.0.7',
-                  'lxml>=4.9.1', 'platformdirs>=2.0.0', 'pytz>=2022.5',
+                  'platformdirs>=2.0.0', 'pytz>=2022.5',
                   'frozendict>=2.3.4', 'peewee>=3.16.2',
-                  'beautifulsoup4>=4.11.1', 'html5lib>=1.1'],
+                  'beautifulsoup4>=4.11.1'],
 extras_require={
     'nospam': ['requests_cache>=1.0', 'requests_ratelimiter>=0.3.1'],
     'repair': ['scipy>=1.6.3'],
```
`tests/context.py` (`@@ -5,9 +5,7 @@ import datetime as _dt` and `@@ -27,19 +25,21 @@`) — the test session drops caching and keeps only rate limiting:

```diff
 import sys
 import os
 import yfinance
-from requests import Session
-from requests_cache import CacheMixin, SQLiteCache
-from requests_ratelimiter import LimiterMixin, MemoryQueueBucket
+from requests_ratelimiter import LimiterSession
 from pyrate_limiter import Duration, RequestRate, Limiter

 _parent_dp = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
```

```diff
     import shutil
     shutil.rmtree(testing_cache_dirpath)


-# Setup a session to rate-limit and cache persistently:
-class CachedLimiterSession(CacheMixin, LimiterMixin, Session):
-    pass
-history_rate = RequestRate(1, Duration.SECOND*2)
+# Setup a session to only rate-limit
+history_rate = RequestRate(1, Duration.SECOND)
 limiter = Limiter(history_rate)
-cache_fp = os.path.join(testing_cache_dirpath, "unittests-cache")
-session_gbl = CachedLimiterSession(
-    limiter=limiter,
-    bucket_class=MemoryQueueBucket,
-    backend=SQLiteCache(cache_fp, expire_after=_dt.timedelta(hours=1)),
-)
-# Use this instead if only want rate-limiting:
-# from requests_ratelimiter import LimiterSession
-# session_gbl = LimiterSession(limiter=limiter)
+session_gbl = LimiterSession(limiter=limiter)
+
+# Use this instead if you also want caching:
+# from requests_cache import CacheMixin, SQLiteCache
+# from requests_ratelimiter import LimiterMixin
+# from requests import Session
+# from pyrate_limiter import MemoryQueueBucket
+# class CachedLimiterSession(CacheMixin, LimiterMixin, Session):
+#     pass
+# cache_fp = os.path.join(testing_cache_dirpath, "unittests-cache")
+# session_gbl = CachedLimiterSession(
+#     limiter=limiter,
+#     bucket_class=MemoryQueueBucket,
+#     backend=SQLiteCache(cache_fp, expire_after=_dt.timedelta(hours=1)),
+# )
```
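The session change above paces test requests at one per second via `pyrate_limiter`. A stdlib-only sketch of what that pacing does (a hypothetical minimal limiter, not the `RequestRate`/`Limiter` machinery the tests actually use):

```python
import time

class MinIntervalLimiter:
    """Hypothetical stand-in: enforce a minimum gap between acquisitions."""
    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = None

    def acquire(self):
        # Sleep just long enough so consecutive calls are spaced out.
        now = time.monotonic()
        if self._last is not None:
            wait = self.min_interval - (now - self._last)
            if wait > 0:
                time.sleep(wait)
        self._last = time.monotonic()

# Demo with a 50 ms gap (tests/context.py uses 1 request per second):
limiter = MinIntervalLimiter(0.05)
t0 = time.monotonic()
for _ in range(3):
    limiter.acquire()
elapsed = time.monotonic() - t0  # at least two enforced gaps
```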
(One file's diff was suppressed because it is too large.)

`tests/test_cache.py` — new file, 93 lines (`@@ -0,0 +1,93 @@`):
```python
"""
Tests for cache

To run all tests in suite from commandline:
   python -m unittest tests.cache

Specific test class:
   python -m unittest tests.cache.TestCache

"""
from unittest import TestSuite

from tests.context import yfinance as yf

import unittest
import tempfile
import os


class TestCache(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.tempCacheDir = tempfile.TemporaryDirectory()
        yf.set_tz_cache_location(cls.tempCacheDir.name)

    @classmethod
    def tearDownClass(cls):
        yf.cache._TzDBManager.close_db()
        cls.tempCacheDir.cleanup()

    def test_storeTzNoRaise(self):
        # storing TZ to cache should never raise exception
        tkr = 'AMZN'
        tz1 = "America/New_York"
        tz2 = "London/Europe"
        cache = yf.cache.get_tz_cache()
        cache.store(tkr, tz1)
        cache.store(tkr, tz2)

    def test_setTzCacheLocation(self):
        self.assertEqual(yf.cache._TzDBManager.get_location(), self.tempCacheDir.name)

        tkr = 'AMZN'
        tz1 = "America/New_York"
        cache = yf.cache.get_tz_cache()
        cache.store(tkr, tz1)

        self.assertTrue(os.path.exists(os.path.join(self.tempCacheDir.name, "tkr-tz.db")))


class TestCacheNoPermission(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        if os.name == "nt":  # Windows
            cls.cache_path = "C:\\Windows\\System32\\yf-cache"
        else:  # Unix/Linux/MacOS
            # Use a directory a normal user cannot write to
            cls.cache_path = "/yf-cache"
        yf.set_tz_cache_location(cls.cache_path)

    def test_tzCacheRootStore(self):
        # Test that if cache path in read-only filesystem, no exception.
        tkr = 'AMZN'
        tz1 = "America/New_York"

        # During attempt to store, will discover cannot write
        yf.cache.get_tz_cache().store(tkr, tz1)

        # Handling the store failure replaces cache with a dummy
        cache = yf.cache.get_tz_cache()
        self.assertTrue(cache.dummy)
        cache.store(tkr, tz1)

    def test_tzCacheRootLookup(self):
        # Test that if cache path in read-only filesystem, no exception.
        tkr = 'AMZN'
        # During attempt to lookup, will discover cannot write
        yf.cache.get_tz_cache().lookup(tkr)

        # Handling the lookup failure replaces cache with a dummy
        cache = yf.cache.get_tz_cache()
        self.assertTrue(cache.dummy)
        cache.lookup(tkr)


def suite():
    ts: TestSuite = unittest.TestSuite()
    ts.addTest(TestCache('Test cache'))
    ts.addTest(TestCacheNoPermission('Test cache no permission'))
    return ts


if __name__ == '__main__':
    unittest.main()
```
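The `TestCacheNoPermission` tests above exercise a fail-soft design: when the cache location cannot be written, yfinance substitutes a dummy cache instead of raising. A self-contained sketch of that pattern (class and function names here are illustrative, not yfinance's actual internals):

```python
import os
import tempfile

class DummyTzCache:
    """No-op stand-in used when the cache location is unwritable."""
    dummy = True
    def store(self, tkr, tz):
        pass
    def lookup(self, tkr):
        return None

class FileTzCache:
    dummy = False
    def __init__(self, directory):
        os.makedirs(directory, exist_ok=True)  # raises OSError if not permitted
        self._data = {}
    def store(self, tkr, tz):
        self._data[tkr] = tz
    def lookup(self, tkr):
        return self._data.get(tkr)

def get_tz_cache(directory):
    try:
        return FileTzCache(directory)
    except OSError:
        return DummyTzCache()  # degrade silently; callers never see the error

writable = get_tz_cache(os.path.join(tempfile.gettempdir(), "yf-demo-cache"))
writable.store("AMZN", "America/New_York")
unwritable = get_tz_cache("/proc/definitely-denied/yf-cache")
```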
`tests/test_price_repair.py` (`class TestPriceRepair`):

```diff
@@ -367,9 +367,9 @@
             "Close": [103.03, 102.05, 102.08],
             "Adj Close": [102.03, 102.05, 102.08],
             "Volume": [560, 137, 117]},
-            index=_pd.to_datetime([_dt.datetime(2022, 11, 1),
-                                   _dt.datetime(2022, 10, 31),
-                                   _dt.datetime(2022, 10, 30)]))
+            index=_pd.to_datetime([_dt.datetime(2024, 11, 1),
+                                   _dt.datetime(2024, 10, 31),
+                                   _dt.datetime(2024, 10, 30)]))
         df_bad = df_bad.sort_index()
         df_bad.index.name = "Date"
         df_bad.index = df_bad.index.tz_localize(tz_exchange)
@@ -377,9 +377,9 @@
         repaired_df = hist._fix_zeroes(df_bad, "1d", tz_exchange, prepost=False)

         correct_df = df_bad.copy()
-        correct_df.loc["2022-11-01", "Open"] = 102.080002
-        correct_df.loc["2022-11-01", "Low"] = 102.032501
-        correct_df.loc["2022-11-01", "High"] = 102.080002
+        correct_df.loc["2024-11-01", "Open"] = 102.572729
+        correct_df.loc["2024-11-01", "Low"] = 102.309091
+        correct_df.loc["2024-11-01", "High"] = 102.572729
         for c in ["Open", "Low", "High", "Close"]:
             self.assertTrue(_np.isclose(repaired_df[c], correct_df[c], rtol=1e-8).all())
@@ -462,7 +462,7 @@
         # Stocks that split in 2022 but no problems in Yahoo data,
         # so repair should change nothing
         good_tkrs = ['AMZN', 'DXCM', 'FTNT', 'GOOG', 'GME', 'PANW', 'SHOP', 'TSLA']
-        good_tkrs += ['AEI', 'GHI', 'IRON', 'LXU', 'NUZE', 'RSLS', 'TISI']
+        good_tkrs += ['AEI', 'GHI', 'IRON', 'LXU', 'RSLS', 'TISI']
         good_tkrs += ['BOL.ST', 'TUI1.DE']
         intervals = ['1d', '1wk', '1mo', '3mo']
         for tkr in good_tkrs:
@@ -580,7 +580,6 @@
         # Div 100x
         bad_tkrs += ['ABDP.L']
-        bad_tkrs += ['ELCO.L']
         bad_tkrs += ['KWS.L']
         bad_tkrs += ['PSH.L']

         # Div 100x and adjust too big
```
Price-history tests (`class TestPriceHistory`) — stale hard-coded dates moved forward a year:

```diff
@@ -118,7 +118,7 @@
                 continue
             test_run = True

-            df = dat.history(start=dt.date() - _dt.timedelta(days=7), interval="1wk")
+            df = dat.history(start=dt.date() - _dt.timedelta(days=13), interval="1wk")
             dt0 = df.index[-2]
             dt1 = df.index[-1]
             try:
@@ -401,7 +401,7 @@
         # Setup
         tkr = "AMZN"
-        special_day = _dt.date(2023, 11, 24)
+        special_day = _dt.date(2024, 11, 29)
         time_early_close = _dt.time(13)
         dat = yf.Ticker(tkr, session=self.session)
@@ -427,8 +427,8 @@
         dat = yf.Ticker(tkr, session=self.session)

         # Test no other afternoons (or mornings) were pruned
-        start_d = _dt.date(2023, 1, 1)
-        end_d = _dt.date(2023+1, 1, 1)
+        start_d = _dt.date(2024, 1, 1)
+        end_d = _dt.date(2024+1, 1, 1)
         df = dat.history(start=start_d, end=end_d, interval="1h", prepost=False, keepna=True)
         last_dts = _pd.Series(df.index).groupby(df.index.date).last()
         dfd = dat.history(start=start_d, end=end_d, interval='1d', prepost=False, keepna=True)
```
Ticker tests (`TestTicker`, `TestTickerHistory`, `TestTickerEarnings`, `TestTickerAnalysts`) — looser URL and schema assertions, wider date leeway:

```diff
@@ -180,7 +180,7 @@
         expected_start = expected_start.replace(hour=0, minute=0, second=0, microsecond=0)

         # leeway added because of weekends
-        self.assertGreaterEqual(actual_start, expected_start - timedelta(days=7),
+        self.assertGreaterEqual(actual_start, expected_start - timedelta(days=10),
                                 f"Start date {actual_start} out of range for period={period}")
         self.assertLessEqual(df.index[-1].to_pydatetime().replace(tzinfo=None), now,
                              f"End date {df.index[-1]} out of range for period={period}")
@@ -308,14 +308,13 @@
             actual_urls_called[i] = u
         actual_urls_called = tuple(actual_urls_called)

-        expected_urls = (
-            f"https://query2.finance.yahoo.com/v8/finance/chart/{symbol}?events=div%2Csplits%2CcapitalGains&includePrePost=False&interval=1d&range={period}",
-        )
-        self.assertEqual(
-            expected_urls,
-            actual_urls_called,
-            "Different than expected url used to fetch history."
-        )
+        expected_urls = [
+            f"https://query2.finance.yahoo.com/v8/finance/chart/{symbol}?interval=1d&range=1d",  # ticker's tz
+            f"https://query2.finance.yahoo.com/v8/finance/chart/{symbol}?events=div%2Csplits%2CcapitalGains&includePrePost=False&interval=1d&range={period}"
+        ]
+        for url in actual_urls_called:
+            self.assertTrue(url in expected_urls, f"Unexpected URL called: {url}")

     def test_dividends(self):
         data = self.ticker.dividends
         self.assertIsInstance(data, pd.Series, "data has wrong type")
@@ -358,7 +357,7 @@
     def test_earnings_dates_with_limit(self):
         # use ticker with lots of historic earnings
         ticker = yf.Ticker("IBM")
-        limit = 110
+        limit = 100
         data = ticker.get_earnings_dates(limit=limit)
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")
@@ -819,9 +818,6 @@
         data = self.ticker.analyst_price_targets
         self.assertIsInstance(data, dict, "data has wrong type")

-        keys = {'current', 'low', 'high', 'mean', 'median'}
-        self.assertCountEqual(data.keys(), keys, "data has wrong keys")
-
         data_cached = self.ticker.analyst_price_targets
         self.assertIs(data, data_cached, "data not cached")
@@ -830,12 +826,6 @@
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")

-        columns = ['numberOfAnalysts', 'avg', 'low', 'high', 'yearAgoEps', 'growth']
-        self.assertCountEqual(data.columns.values.tolist(), columns, "data has wrong column names")
-
-        index = ['0q', '+1q', '0y', '+1y']
-        self.assertCountEqual(data.index.values.tolist(), index, "data has wrong row names")
-
         data_cached = self.ticker.earnings_estimate
         self.assertIs(data, data_cached, "data not cached")
@@ -844,12 +834,6 @@
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")

-        columns = ['numberOfAnalysts', 'avg', 'low', 'high', 'yearAgoRevenue', 'growth']
-        self.assertCountEqual(data.columns.values.tolist(), columns, "data has wrong column names")
-
-        index = ['0q', '+1q', '0y', '+1y']
-        self.assertCountEqual(data.index.values.tolist(), index, "data has wrong row names")
-
         data_cached = self.ticker.revenue_estimate
         self.assertIs(data, data_cached, "data not cached")
@@ -858,8 +842,6 @@
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")

-        columns = ['epsEstimate', 'epsActual', 'epsDifference', 'surprisePercent']
-        self.assertCountEqual(data.columns.values.tolist(), columns, "data has wrong column names")
         self.assertIsInstance(data.index, pd.DatetimeIndex, "data has wrong index type")

         data_cached = self.ticker.earnings_history
@@ -870,12 +852,6 @@
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")

-        columns = ['current', '7daysAgo', '30daysAgo', '60daysAgo', '90daysAgo']
-        self.assertCountEqual(data.columns.values.tolist(), columns, "data has wrong column names")
-
-        index = ['0q', '+1q', '0y', '+1y']
-        self.assertCountEqual(data.index.values.tolist(), index, "data has wrong row names")
-
         data_cached = self.ticker.eps_trend
         self.assertIs(data, data_cached, "data not cached")
@@ -884,12 +860,6 @@
         self.assertIsInstance(data, pd.DataFrame, "data has wrong type")
         self.assertFalse(data.empty, "data is empty")

-        columns = ['stockTrend', 'indexTrend']
-        self.assertCountEqual(data.columns.values.tolist(), columns, "data has wrong column names")
-
-        index = ['0q', '+1q', '0y', '+1y', '+5y']
-        self.assertCountEqual(data.index.values.tolist(), index, "data has wrong row names")
-
         data_cached = self.ticker.growth_estimates
         self.assertIs(data, data_cached, "data not cached")
```
Utils test module (`TestPandas`, `TestUtils`) — the cache tests move out to the new `tests/test_cache.py` above:

```diff
@@ -12,78 +12,12 @@ from datetime import datetime
 from unittest import TestSuite

 import pandas as pd
 # import numpy as np

 from tests.context import yfinance as yf

 import unittest
 # import requests_cache
-import tempfile
-import os

 from yfinance.utils import is_valid_period_format


-class TestCache(unittest.TestCase):
-    @classmethod
-    def setUpClass(cls):
-        cls.tempCacheDir = tempfile.TemporaryDirectory()
-        yf.set_tz_cache_location(cls.tempCacheDir.name)
-
-    @classmethod
-    def tearDownClass(cls):
-        cls.tempCacheDir.cleanup()
-
-    def test_storeTzNoRaise(self):
-        # storing TZ to cache should never raise exception
-        tkr = 'AMZN'
-        tz1 = "America/New_York"
-        tz2 = "London/Europe"
-        cache = yf.cache.get_tz_cache()
-        cache.store(tkr, tz1)
-        cache.store(tkr, tz2)
-
-    def test_setTzCacheLocation(self):
-        self.assertEqual(yf.cache._TzDBManager.get_location(), self.tempCacheDir.name)
-
-        tkr = 'AMZN'
-        tz1 = "America/New_York"
-        cache = yf.cache.get_tz_cache()
-        cache.store(tkr, tz1)
-
-        self.assertTrue(os.path.exists(os.path.join(self.tempCacheDir.name, "tkr-tz.db")))
-
-
-class TestCacheNoPermission(unittest.TestCase):
-    @classmethod
-    def setUpClass(cls):
-        yf.set_tz_cache_location("/root/yf-cache")
-
-    def test_tzCacheRootStore(self):
-        # Test that if cache path in read-only filesystem, no exception.
-        tkr = 'AMZN'
-        tz1 = "America/New_York"
-
-        # During attempt to store, will discover cannot write
-        yf.cache.get_tz_cache().store(tkr, tz1)
-
-        # Handling the store failure replaces cache with a dummy
-        cache = yf.cache.get_tz_cache()
-        self.assertTrue(cache.dummy)
-        cache.store(tkr, tz1)
-
-    def test_tzCacheRootLookup(self):
-        # Test that if cache path in read-only filesystem, no exception.
-        tkr = 'AMZN'
-        # During attempt to lookup, will discover cannot write
-        yf.cache.get_tz_cache().lookup(tkr)
-
-        # Handling the lookup failure replaces cache with a dummy
-        cache = yf.cache.get_tz_cache()
-        self.assertTrue(cache.dummy)
-        cache.lookup(tkr)
-
-
 class TestPandas(unittest.TestCase):
     date_strings = ["2024-08-07 09:05:00+02:00", "2024-08-07 09:05:00-04:00"]
@@ -129,8 +63,6 @@
 def suite():
     ts: TestSuite = unittest.TestSuite()
-    ts.addTest(TestCache('Test cache'))
-    ts.addTest(TestCacheNoPermission('Test cache no permission'))
     ts.addTest(TestPandas("Test pandas"))
     ts.addTest(TestUtils("Test utils"))
     return ts
```
`yfinance/base.py` — 121 lines changed; `get_earnings_dates()` stops scraping the HTML earnings calendar and instead POSTs to Yahoo's visualization API (#2247):

```diff
@@ -21,15 +21,14 @@
 from __future__ import print_function

-from io import StringIO
 import json as _json
 import warnings
 from typing import Optional, Union
 from urllib.parse import quote as urlencode

+import numpy as np
 import pandas as pd
 import requests
-from datetime import date

 from . import utils, cache
 from .data import YfData
@@ -41,7 +40,7 @@
 from .scrapers.history import PriceHistory
 from .scrapers.funds import FundsData

-from .const import _BASE_URL_, _ROOT_URL_
+from .const import _BASE_URL_, _ROOT_URL_, _QUERY1_URL_


 class TickerBase:
@@ -593,94 +592,62 @@
         Returns:
             pd.DataFrame
         """
-        if self._earnings_dates and limit in self._earnings_dates:
-            return self._earnings_dates[limit]
-
         logger = utils.get_yf_logger()
+        clamped_limit = min(limit, 100)  # YF caps at 100, don't go higher

-        page_size = min(limit, 100)  # YF caps at 100, don't go higher
-        page_offset = 0
-        dates = None
-        while True:
-            url = f"{_ROOT_URL_}/calendar/earnings?day={date.today()}&symbol={self.ticker}&offset={page_offset}&size={page_size}"
-            data = self._data.cache_get(url=url, proxy=proxy).text
+        if self._earnings_dates and clamped_limit in self._earnings_dates:
+            return self._earnings_dates[clamped_limit]

-            if "Will be right back" in data:
-                raise RuntimeError("*** YAHOO! FINANCE IS CURRENTLY DOWN! ***\n"
-                                   "Our engineers are working quickly to resolve "
-                                   "the issue. Thank you for your patience.")
+        # Fetch data
+        url = f"{_QUERY1_URL_}/v1/finance/visualization"
+        params = {"lang": "en-US", "region": "US"}
+        body = {
+            "size": clamped_limit,
+            "query": {
+                "operator": "and",
+                "operands": [
+                    {"operator": "eq", "operands": ["ticker", self.ticker]},
+                    {"operator": "eq", "operands": ["eventtype", "2"]}
+                ]
+            },
+            "sortField": "startdatetime",
+            "sortType": "DESC",
+            "entityIdType": "earnings",
+            "includeFields": ["startdatetime", "timeZoneShortName", "epsestimate", "epsactual", "epssurprisepct"]
+        }
+        response = self._data.post(url, params=params, body=body, proxy=proxy)
+        json_data = response.json()

-            try:
-                data = pd.read_html(StringIO(data))[0]
-            except ValueError:
-                if page_offset == 0:
-                    # Should not fail on first page
-                    if "Showing Earnings for:" in data:
-                        # Actually YF was successful, problem is company doesn't have earnings history
-                        dates = utils.empty_earnings_dates_df()
-                break
-            if dates is None:
-                dates = data
-            else:
-                dates = pd.concat([dates, data], axis=0)
+        # Extract data
+        columns = [row['label'] for row in json_data['finance']['result'][0]['documents'][0]['columns']]
+        rows = json_data['finance']['result'][0]['documents'][0]['rows']
+        df = pd.DataFrame(rows, columns=columns)

-            page_offset += page_size
-            # got less data then we asked for or already fetched all we requested, no need to fetch more pages
-            if len(data) < page_size or len(dates) >= limit:
-                dates = dates.iloc[:limit]
-                break
-            else:
-                # do not fetch more than needed next time
-                page_size = min(limit - len(dates), page_size)
-
-        if dates is None or dates.shape[0] == 0:
+        if df.empty:
             _exception = YFEarningsDateMissing(self.ticker)
             err_msg = str(_exception)
             logger.error(f'{self.ticker}: {err_msg}')
             return None
-        dates = dates.reset_index(drop=True)
-
-        # Drop redundant columns
-        dates = dates.drop(["Symbol", "Company"], axis=1)
-
-        # Compatibility
-        dates = dates.rename(columns={'Surprise (%)': 'Surprise(%)'})
-
-        # Drop empty rows
-        for i in range(len(dates)-1, -1, -1):
-            if dates.iloc[i].isna().all():
-                dates = dates.drop(i)
+
+        # Calculate earnings date
+        df['Earnings Date'] = pd.to_datetime(df['Event Start Date']).dt.normalize()
+        tz = self._get_ticker_tz(proxy=proxy, timeout=30)
+        if df['Earnings Date'].dt.tz is None:
+            df['Earnings Date'] = df['Earnings Date'].dt.tz_localize(tz)
+        else:
+            df['Earnings Date'] = df['Earnings Date'].dt.tz_convert(tz)

         # Convert types
-        for cn in ["EPS Estimate", "Reported EPS", "Surprise(%)"]:
-            dates.loc[dates[cn] == '-', cn] = float("nan")
-            dates[cn] = dates[cn].astype(float)
+        columns_to_update = ['Surprise (%)', 'EPS Estimate', 'Reported EPS']
+        df[columns_to_update] = df[columns_to_update].astype('float64').replace(0.0, np.nan)

-        # Parse earnings date string
-        cn = "Earnings Date"
-        try:
-            dates_backup = dates.copy()
-            # - extract timezone because Yahoo stopped returning in UTC
-            tzy = dates[cn].str.split(' ').str.get(-1)
-            tzy[tzy.isin(['EDT', 'EST'])] = 'US/Eastern'
-            # - tidy date string
-            dates[cn] = dates[cn].str.split(' ').str[:-1].str.join(' ')
-            dates[cn] = dates[cn].replace(' at', ',', regex=True)
-            # - parse
-            dates[cn] = pd.to_datetime(dates[cn], format="%B %d, %Y, %I %p")
-            # - convert to exchange timezone
-            self._quote.proxy = proxy or self.proxy
-            tz = self._get_ticker_tz(proxy=proxy, timeout=30)
-            dates[cn] = [dates[cn].iloc[i].tz_localize(tzy.iloc[i], ambiguous=True).tz_convert(tz) for i in range(len(dates))]
+        # Format the dataframe
+        df.drop(['Event Start Date', 'Timezone short name'], axis=1, inplace=True)
+        df.set_index('Earnings Date', inplace=True)
+        df.rename(columns={'Surprise (%)': 'Surprise(%)'}, inplace=True)  # Compatibility

-            dates = dates.set_index("Earnings Date")
-        except Exception as e:
-            utils.get_yf_logger().info(f"{self.ticker}: Problem parsing earnings_dates: {str(e)}")
-            dates = dates_backup
-
-        self._earnings_dates[limit] = dates
-
-        return dates
+        self._earnings_dates[clamped_limit] = df
+        return df

     def get_history_metadata(self, proxy=None) -> dict:
         return self._lazy_load_price_history().get_history_metadata(proxy)
```
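The refactor above replaces paginated HTML scraping with a single JSON payload. A sketch of the unpacking step against a hand-made sample response (the field labels follow the hunk's `includeFields`; the values are invented for illustration):

```python
# Hand-made sample shaped like json_data['finance']['result'][0]['documents'][0]
json_data = {
    "finance": {"result": [{"documents": [{
        "columns": [{"label": "Event Start Date"},
                    {"label": "Timezone short name"},
                    {"label": "EPS Estimate"},
                    {"label": "Reported EPS"},
                    {"label": "Surprise (%)"}],
        "rows": [["2025-01-30T21:00:00Z", "EST", 1.49, 1.86, 24.8]],
    }]}]}
}

doc = json_data["finance"]["result"][0]["documents"][0]
columns = [col["label"] for col in doc["columns"]]  # same comprehension as the new code
records = [dict(zip(columns, row)) for row in doc["rows"]]
```

The real code hands `rows` and `columns` straight to `pd.DataFrame(rows, columns=columns)`; plain dicts here keep the sketch dependency-free.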
`yfinance/const.py` (`@@ -624,3 +624,21 @@ EQUITY_SCREENER_FIELDS = {`) — adds a pool of modern browser user-agent strings for the #2277 hotfix:

```diff
     "highest_controversy"}
 }
 EQUITY_SCREENER_FIELDS = merge_two_level_dicts(EQUITY_SCREENER_FIELDS, COMMON_SCREENER_FIELDS)
+
+USER_AGENTS = [
+    # Chrome
+    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
+    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
+    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
+
+    # Firefox
+    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0",
+    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14.7; rv:135.0) Gecko/20100101 Firefox/135.0",
+    "Mozilla/5.0 (X11; Linux i686; rv:135.0) Gecko/20100101 Firefox/135.0",
+
+    # Safari
+    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15",
+
+    # Edge
+    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36 Edg/131.0.2903.86"
+]
```
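`yfinance/data.py` consumes this pool with `random.choice`, so each process presents one randomly chosen modern browser signature instead of the old hard-coded Chrome 39 string. A sketch with an abbreviated copy of the pool:

```python
import random

# Abbreviated copy of the USER_AGENTS pool added above
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0",
]

# Chosen once when the class attribute is evaluated, as in YfData.user_agent_headers
user_agent_headers = {"User-Agent": random.choice(USER_AGENTS)}
```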
@@ -1,4 +1,5 @@
import functools
import random
from functools import lru_cache

import requests as requests
@@ -10,6 +11,7 @@ from frozendict import frozendict
from . import utils, cache
import threading

from .const import USER_AGENTS
from .exceptions import YFRateLimitError

cache_maxsize = 64
@@ -59,7 +61,8 @@ class YfData(metaclass=SingletonMeta):
Singleton means one session one cookie shared by all threads.
"""
user_agent_headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
'User-Agent': random.choice(USER_AGENTS)
}
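The hunk above (the 0.2.54 user-agent hotfix, #2277) replaces the hardcoded Chrome 39 agent with one picked at random from the new `USER_AGENTS` constant. A minimal sketch of the pattern, with a two-entry stand-in for the full list in `const.py`:

```python
import random

# Stand-in for the USER_AGENTS constant added to const.py in this diff.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0",
]

class YfDataSketch:
    # The class attribute is evaluated once, when the class body executes,
    # so every instance in the process shares the same randomly chosen agent.
    user_agent_headers = {"User-Agent": random.choice(USER_AGENTS)}
```

Because the choice happens at class-definition time, a fresh agent only appears on re-import; that is a property of the diff's approach rather than of `random.choice` itself.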

def __init__(self, session=None):
self._crumb = None
@@ -231,11 +234,16 @@ class YfData(metaclass=SingletonMeta):
'timeout': timeout}

get_args = {**base_args, 'url': 'https://guce.yahoo.com/consent'}
if self._session_is_caching:
get_args['expire_after'] = self._expire_after
response = self._session.get(**get_args)
else:
response = self._session.get(**get_args)
try:
if self._session_is_caching:
get_args['expire_after'] = self._expire_after
response = self._session.get(**get_args)
else:
response = self._session.get(**get_args)
except requests.exceptions.ChunkedEncodingError:
# No idea why happens, but handle nicely so can switch to other cookie method.
utils.get_yf_logger().debug('_get_cookie_csrf() encountering requests.exceptions.ChunkedEncodingError, aborting')
return False

soup = BeautifulSoup(response.content, 'html.parser')
csrfTokenInput = soup.find('input', attrs={'name': 'csrfToken'})
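The change above wraps the consent-page GET in a try/except so a `requests.exceptions.ChunkedEncodingError` aborts this cookie strategy instead of crashing yfinance (#2249). A stripped-down sketch of the pattern, with a hypothetical stub session standing in for a server that dies mid-response:

```python
import requests

class FlakySession:
    """Hypothetical stand-in for a requests.Session whose response breaks mid-body."""
    def get(self, url, **kwargs):
        raise requests.exceptions.ChunkedEncodingError("connection broken mid-response")

def fetch_consent_page(session, url):
    # Same shape as the diff: treat the error as a soft failure (None here,
    # False in yfinance) so the caller can fall back to the other cookie method.
    try:
        return session.get(url)
    except requests.exceptions.ChunkedEncodingError:
        return None
```

The point of returning a sentinel rather than re-raising is that `_get_cookie_csrf()` is only one of two cookie strategies, so a partial download should trigger the fallback, not a crash.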
@@ -264,14 +272,18 @@ class YfData(metaclass=SingletonMeta):
get_args = {**base_args,
'url': f'https://guce.yahoo.com/copyConsent?sessionId={sessionId}',
'data': data}
if self._session_is_caching:
post_args['expire_after'] = self._expire_after
get_args['expire_after'] = self._expire_after
self._session.post(**post_args)
self._session.get(**get_args)
else:
self._session.post(**post_args)
self._session.get(**get_args)
try:
if self._session_is_caching:
post_args['expire_after'] = self._expire_after
get_args['expire_after'] = self._expire_after
self._session.post(**post_args)
self._session.get(**get_args)
else:
self._session.post(**post_args)
self._session.get(**get_args)
except requests.exceptions.ChunkedEncodingError:
# No idea why happens, but handle nicely so can switch to other cookie method.
utils.get_yf_logger().debug('_get_cookie_csrf() encountering requests.exceptions.ChunkedEncodingError, aborting')
self._cookie = True
self._save_session_cookies()
return True

@@ -5,7 +5,7 @@ from ..data import utils
from ..const import _QUERY1_URL_
import json as _json

class Market():
class Market:
def __init__(self, market:'str', session=None, proxy=None, timeout=30):
self.market = market
self.session = session
@@ -52,7 +52,7 @@ class Market():
status_params = {
"formatted": True,
"key": "finance",
"lang": "en-GB",
"lang": "en-US",
"market": self.market
}

@@ -73,11 +73,11 @@ class Market():
self._status['timezone'] = self._status['timezone'][0]
del self._status['time'] # redundant
try:
self._status.update(
open = dt.datetime.fromisoformat(self._status["open"]),
close = dt.datetime.fromisoformat(self._status["close"]),
tz = dt.timezone(self._status["timezone"]["gmtoffset"], self._status["timezone"]["short"])
)
self._status.update({
"open": dt.datetime.fromisoformat(self._status["open"]),
"close": dt.datetime.fromisoformat(self._status["close"]),
"tz": dt.timezone(dt.timedelta(hours=int(self._status["timezone"]["gmtoffset"]))/1000, self._status["timezone"]["short"])
})
except Exception as e:
self._logger.error(f"{self.market}: Failed to update market status")
self._logger.debug(f"{type(e)}: {e}")

@@ -36,7 +36,7 @@ from . import shared

@utils.log_indent_decorator
def download(tickers, start=None, end=None, actions=False, threads=True,
ignore_tz=None, group_by='column', auto_adjust=True, back_adjust=False,
ignore_tz=None, group_by='column', auto_adjust=None, back_adjust=False,
repair=False, keepna=False, progress=True, period="max", interval="1d",
prepost=False, proxy=None, rounding=False, timeout=10, session=None,
multi_level_index=True) -> Union[_pd.DataFrame, None]:
@@ -93,6 +98,11 @@ def download(tickers, start=None, end=None, actions=False, threads=True,
"""
logger = utils.get_yf_logger()

if auto_adjust is None:
# Warn users that default has changed to True
utils.print_once("YF.download() has changed argument auto_adjust default to True")
auto_adjust = True

if logger.isEnabledFor(logging.DEBUG):
if threads:
# With DEBUG, each thread generates a lot of log messages.
@@ -106,7 +111,7 @@ def download(tickers, start=None, end=None, actions=False, threads=True,

if ignore_tz is None:
# Set default value depending on interval
if interval[1:] in ['m', 'h']:
if interval[-1] in ['m', 'h']:
# Intraday
ignore_tz = False
else:
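The one-character fix in the hunk above matters for multi-digit intervals: the old test `interval[1:] in ['m', 'h']` only matched single-digit intervals like `"1m"`, because `"15m"[1:]` is `"5m"`. Checking the last character covers every minute and hour interval. A quick illustration (`is_intraday` is a hypothetical helper name, not yfinance API):

```python
def is_intraday(interval: str) -> bool:
    # New check from the diff: classify by the interval's unit suffix.
    return interval[-1] in ('m', 'h')

def is_intraday_old(interval: str) -> bool:
    # Old, buggy check: "15m"[1:] == "5m", which is not in the list.
    return interval[1:] in ('m', 'h')
```

Both versions agree on `"1m"` and `"1h"`; they diverge on `"15m"`, `"30m"`, `"90m"`, which the old check misclassified as non-intraday.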
@@ -180,7 +185,7 @@ def download(tickers, start=None, end=None, actions=False, threads=True,
errors = {}
for ticker in shared._ERRORS:
err = shared._ERRORS[ticker]
err = err.replace(f'{ticker}', '%ticker%')
err = err.replace(f'${ticker}: ', '')
if err not in errors:
errors[err] = [ticker]
else:
@@ -192,7 +197,7 @@ def download(tickers, start=None, end=None, actions=False, threads=True,
tbs = {}
for ticker in shared._TRACEBACKS:
tb = shared._TRACEBACKS[ticker]
tb = tb.replace(f'{ticker}', '%ticker%')
tb = tb.replace(f'${ticker}: ', '')
if tb not in tbs:
tbs[tb] = [ticker]
else:

@@ -1304,7 +1304,7 @@ class PriceHistory:

if df is None or df.empty:
return df
if interval != '1d':
if interval in ['1wk', '1mo', '3mo', '1y']:
return df

logger = utils.get_yf_logger()
@@ -1614,9 +1614,9 @@ class PriceHistory:
checks += ['adj_missing', 'adj_exceeds_div', 'div_exceeds_adj']

div_status_df['phantom'] = False
phantom_proximity_threshold = _datetime.timedelta(days=7)
phantom_proximity_threshold = _datetime.timedelta(days=17)
f = div_status_df[['div_too_big', 'div_exceeds_adj']].any(axis=1)
if f.any():
if f.any() and len(div_status_df) > 1:
# One/some of these may be phantom dividends. Clue is if another correct dividend is very close
indices = np.where(f)[0]
dts_to_check = div_status_df.index[f]
@@ -1625,37 +1625,24 @@ class PriceHistory:
div_dt = div.name
phantom_dt = None
if i > 0:
prev_div = div_status_df.iloc[i-1]
ratio1 = (div['div']/currency_divide) / prev_div['div']
ratio2 = div['div'] / prev_div['div']
divergence = min(abs(ratio1-1.0), abs(ratio2-1.0))
if abs(div_dt-prev_div.name) <= phantom_proximity_threshold and not prev_div['phantom'] and divergence < 0.01:
if prev_div.name in dts_to_check:
# Both this and previous are anomalous, so mark smallest drop as phantom
drop = div['drop']
drop_prev = prev_div['drop']
if drop > 1.5*drop_prev:
phantom_dt = prev_div.name
else:
phantom_dt = div_dt
else:
phantom_dt = div_dt
elif i < len(div_status_df)-1:
next_div = div_status_df.iloc[i+1]
ratio1 = (div['div']/currency_divide) / next_div['div']
ratio2 = div['div'] / next_div['div']
divergence = min(abs(ratio1-1.0), abs(ratio2-1.0))
if abs(div_dt-next_div.name) <= phantom_proximity_threshold and divergence < 0.01:
if next_div.name in dts_to_check:
# Both this and previous are anomalous, so mark smallest drop as phantom
drop = div['drop']
drop_next = next_div['drop']
if drop > 1.5*drop_next:
phantom_dt = next_div.name
else:
phantom_dt = div_dt
other_div = div_status_df.iloc[i-1]
else:
other_div = div_status_df.iloc[i+1]
ratio1 = (div['div']/currency_divide) / other_div['div']
ratio2 = div['div'] / other_div['div']
divergence = min(abs(ratio1-1.0), abs(ratio2-1.0))
if abs(div_dt-other_div.name) <= phantom_proximity_threshold and not other_div['phantom'] and divergence < 0.01:
if other_div.name in dts_to_check:
# Both this and previous are anomalous, so mark smallest drop as phantom
drop = div['drop']
drop_next = other_div['drop']
if drop > 1.5*drop_next:
phantom_dt = other_div.name
else:
phantom_dt = div_dt
else:
phantom_dt = div_dt

if phantom_dt:
div_status_df.loc[phantom_dt, 'phantom'] = True
for c in checks:
@@ -1754,7 +1741,7 @@ class PriceHistory:
lookahead_idx = bisect.bisect_left(df2.index, lookahead_date)
lookahead_idx = min(lookahead_idx, len(df2)-1)
# In rare cases, the price dropped 1 day before dividend (DVD.OL @ 2024-05-15)
lookback_idx = div_idx-2 if div_idx > 1 else div_idx-1
lookback_idx = max(0, div_idx-14)
# Check for bad stock splits in the lookahead period -
# if present, reduce lookahead to before.
future_changes = df2['Close'].iloc[div_idx:lookahead_idx+1].pct_change()
@@ -1776,8 +1763,6 @@ class PriceHistory:
adjDeltas = x['Adj Low'].iloc[1:].to_numpy() - x['Adj Close'].iloc[:-1].to_numpy()
adjDeltas = np.append([0.0], adjDeltas)
x['adjDelta'] = adjDeltas
for i in np.where(x['Dividends']>0)[0]:
x.loc[x.index[i], 'adjDelta'] += x['Dividends'].iloc[i]*x['Adj'].iloc[i]
deltas = x[['delta', 'adjDelta']]
if div_pct > 0.05 and div_pct < 1.0:
adjDiv = div * x['Adj'].iloc[0]
@@ -1912,7 +1897,7 @@ class PriceHistory:
pct_fail = n_fail / n
if c == 'div_too_big':
true_threshold = 1.0
fals_threshold = 0.2
fals_threshold = 0.25

if 'div_date_wrong' in cluster.columns and (cluster[c] == cluster['div_date_wrong']).all():
continue
@@ -1991,7 +1976,7 @@ class PriceHistory:
if c == 'div_date_wrong':
# Fine, these should be rare
continue
if c == 'div_pre_split':
if c in ['div_pre_split', 'div_too_big_and_pre_split']:
# Fine, these should be rare
continue

@@ -2227,6 +2212,26 @@ class PriceHistory:
df2_nan.loc[:enddt, 'Repaired?'] = True
cluster.loc[dt, 'Fixed?'] = True

elif n_failed_checks == 3:
if div_too_big and div_exceeds_adj and div_pre_split:
k = 'too-big div & pre-split'
correction = (1.0/currency_divide) * (1.0/df2['Stock Splits'].loc[dt])
correct_div = row['div'] * correction
df2.loc[dt, 'Dividends'] = correct_div

target_div_pct = row['%'] * correction
target_adj = 1.0 - target_div_pct
present_adj = row['present adj']
# Also correct adjustment to match corrected dividend
k += ' & div-adjust'
adj_correction = target_adj / present_adj
df2.loc[ :enddt, 'Adj Close'] *= adj_correction
df2.loc[ :enddt, 'Repaired?'] = True
df2_nan.loc[:enddt, 'Adj Close'] *= adj_correction
df2_nan.loc[:enddt, 'Repaired?'] = True
cluster.loc[dt, 'Fixed?'] = True
div_repairs.setdefault(k, []).append(dt)

if cluster.empty:
continue

@@ -2482,14 +2487,14 @@ class PriceHistory:

r = _1d_change_x / split_rcp
f_down = _1d_change_x < 1.0 / threshold
if f_down.any():
# Discard where triggered by negative Adj Close after dividend
f_neg = _1d_change_x < 0.0
f_div = (df2['Dividends']>0).to_numpy()
f_div_before = np.roll(f_div, 1)
if f_down.ndim == 2:
f_div_before = f_div_before[:, np.newaxis].repeat(f_down.shape[1], axis=1)
f_down = f_down & ~(f_neg + f_div_before)
# if f_down.any():
# # Discard where triggered by negative Adj Close after dividend
# f_neg = _1d_change_x < 0.0
# f_div = (df2['Dividends']>0).to_numpy()
# f_div_before = np.roll(f_div, 1)
# if f_down.ndim == 2:
# f_div_before = f_div_before[:, np.newaxis].repeat(f_down.shape[1], axis=1)
# f_down = f_down & ~(f_neg + f_div_before)
f_up = _1d_change_x > threshold
f_up_ndims = len(f_up.shape)
f_up_shifts = f_up if f_up_ndims==1 else f_up.any(axis=1)
@@ -2512,7 +2517,7 @@ class PriceHistory:
# assume false positive
continue
avg_vol_after = df2['Volume'].iloc[lookback:i-1].mean()
if not np.isnan(avg_vol_after) and v/avg_vol_after < 2.0:
if not np.isnan(avg_vol_after) and avg_vol_after > 0 and v/avg_vol_after < 2.0:
# volume spike is actually a step-change, so
# probably missing stock split
continue

@@ -1,5 +1,3 @@
# from io import StringIO

import pandas as pd
import requests

@@ -8,7 +6,7 @@ from yfinance.data import YfData
from yfinance.const import _BASE_URL_
from yfinance.exceptions import YFDataException

_QUOTE_SUMMARY_URL_ = f"{_BASE_URL_}/v10/finance/quoteSummary/"
_QUOTE_SUMMARY_URL_ = f"{_BASE_URL_}/v10/finance/quoteSummary"


class Holders:
@@ -31,42 +29,36 @@ class Holders:
@property
def major(self) -> pd.DataFrame:
if self._major is None:
# self._scrape(self.proxy)
self._fetch_and_parse()
return self._major

@property
def institutional(self) -> pd.DataFrame:
if self._institutional is None:
# self._scrape(self.proxy)
self._fetch_and_parse()
return self._institutional

@property
def mutualfund(self) -> pd.DataFrame:
if self._mutualfund is None:
# self._scrape(self.proxy)
self._fetch_and_parse()
return self._mutualfund

@property
def insider_transactions(self) -> pd.DataFrame:
if self._insider_transactions is None:
# self._scrape_insider_transactions(self.proxy)
self._fetch_and_parse()
return self._insider_transactions

@property
def insider_purchases(self) -> pd.DataFrame:
if self._insider_purchases is None:
# self._scrape_insider_transactions(self.proxy)
self._fetch_and_parse()
return self._insider_purchases

@property
def insider_roster(self) -> pd.DataFrame:
if self._insider_roster is None:
# self._scrape_insider_ros(self.proxy)
self._fetch_and_parse()
return self._insider_roster

@@ -187,8 +179,10 @@ class Holders:
del owner["maxAge"]
df = pd.DataFrame(holders)
if not df.empty:
df["positionDirectDate"] = pd.to_datetime(df["positionDirectDate"], unit="s")
df["latestTransDate"] = pd.to_datetime(df["latestTransDate"], unit="s")
if "positionDirectDate" in df:
df["positionDirectDate"] = pd.to_datetime(df["positionDirectDate"], unit="s")
if "latestTransDate" in df:
df["latestTransDate"] = pd.to_datetime(df["latestTransDate"], unit="s")

df.rename(columns={
"name": "Name",

@@ -7,7 +7,7 @@ import requests

from yfinance import utils
from yfinance.data import YfData
from yfinance.const import quote_summary_valid_modules, _BASE_URL_
from yfinance.const import quote_summary_valid_modules, _BASE_URL_, _QUERY1_URL_
from yfinance.exceptions import YFDataException, YFException

info_retired_keys_price = {"currentPrice", "dayHigh", "dayLow", "open", "previousClose", "volume", "volume24Hr"}
@@ -590,33 +590,56 @@ class Quote:
return None
return result

def _fetch_additional_info(self, proxy):
params_dict = {"symbols": self._symbol, "formatted": "false"}
try:
result = self._data.get_raw_json(f"{_QUERY1_URL_}/v7/finance/quote?",
user_agent_headers=self._data.user_agent_headers,
params=params_dict, proxy=proxy)
except requests.exceptions.HTTPError as e:
utils.get_yf_logger().error(str(e))
return None
return result

def _fetch_info(self, proxy):
if self._already_fetched:
return
self._already_fetched = True
modules = ['financialData', 'quoteType', 'defaultKeyStatistics', 'assetProfile', 'summaryDetail']
result = self._fetch(proxy, modules=modules)
result.update(self._fetch_additional_info(proxy))
if result is None:
self._info = {}
return

result["quoteSummary"]["result"][0]["symbol"] = self._symbol
query1_info = next(
(info for info in result.get("quoteSummary", {}).get("result", []) if info["symbol"] == self._symbol),
None,
)
# Most keys that appear in multiple dicts have same value. Except 'maxAge' because
# Yahoo not consistent with days vs seconds. Fix it here:
for k in query1_info:
if "maxAge" in query1_info[k] and query1_info[k]["maxAge"] == 1:
query1_info[k]["maxAge"] = 86400
query1_info = {
k1: v1
for k, v in query1_info.items()
if isinstance(v, dict)
for k1, v1 in v.items()
if v1
}
query1_info = {}
for quote in ["quoteSummary", "quoteResponse"]:
if quote in result:
result[quote]["result"][0]["symbol"] = self._symbol
query_info = next(
(info for info in result.get(quote, {}).get("result", [])
if info["symbol"] == self._symbol),
None,
)
if query_info:
query1_info.update(query_info)

# Normalize and flatten nested dictionaries while converting maxAge from days (1) to seconds (86400).
# This handles Yahoo Finance API inconsistency where maxAge is sometimes expressed in days instead of seconds.
processed_info = {}
for k, v in query1_info.items():

# Handle nested dictionary
if isinstance(v, dict):
for k1, v1 in v.items():
if v1 is not None:
processed_info[k1] = 86400 if k1 == "maxAge" and v1 == 1 else v1

elif v is not None:
processed_info[k] = v

query1_info = processed_info
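The replacement loop above flattens one level of nesting, drops `None` values, and normalises `maxAge` from Yahoo's inconsistent days (`1`) to seconds (`86400`). A self-contained sketch of the same logic, run on a made-up payload rather than real Yahoo output:

```python
def flatten_quote_info(query1_info: dict) -> dict:
    # Mirrors the diff: one level of flattening, None values dropped,
    # maxAge == 1 (days) rewritten as 86400 (seconds).
    processed_info = {}
    for k, v in query1_info.items():
        if isinstance(v, dict):
            for k1, v1 in v.items():
                if v1 is not None:
                    processed_info[k1] = 86400 if k1 == "maxAge" and v1 == 1 else v1
        elif v is not None:
            processed_info[k] = v
    return processed_info

# Made-up example payload, not real Yahoo output.
raw = {"summaryDetail": {"maxAge": 1, "beta": 1.2}, "symbol": "MSFT", "gap": None}
```

Note the `is not None` test is what distinguishes this version from the old dict comprehension's bare `if v1`, which also discarded legitimate falsy values such as `0` and `False`.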

# recursively format but only because of 'companyOfficers'

def _format(k, v):
@@ -631,9 +654,8 @@ class Quote:
else:
v2 = v
return v2
for k, v in query1_info.items():
query1_info[k] = _format(k, v)
self._info = query1_info

self._info = {k: _format(k, v) for k, v in query1_info.items()}

def _fetch_complementary(self, proxy):
if self._already_fetched_complementary:

@@ -1 +1 @@
version = "0.2.52"
version = "0.2.54"