Commit 239fed6

Merge branch 'buriy:master' into master
2 parents 5ecf51d + b220919 commit 239fed6

17 files changed

Lines changed: 230 additions & 176 deletions
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python
+
+name: Python package
+
+on:
+  push:
+    branches: [ "master" ]
+  pull_request:
+    branches: [ "master" ]
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v3
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        python -m pip install flake8 pytest
+        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+    - name: Lint with flake8
+      run: |
+        # stop the build if there are Python syntax errors or undefined names
+        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Test with pytest
+      run: |
+        pytest
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# This workflow will upload a Python Package using Twine when a release is created
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
+
+# This workflow uses actions that are not certified by GitHub.
+# They are provided by a third-party and are governed by
+# separate terms of service, privacy policy, and support
+# documentation.
+
+name: Upload Python Package
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  contents: read
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python
+      uses: actions/setup-python@v3
+      with:
+        python-version: '3.x'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install build
+    - name: Build package
+      run: python -m build
+    - name: Publish package
+      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
+      with:
+        user: __token__
+        password: ${{ secrets.PYPI_API_TOKEN }}

.travis.yml

Lines changed: 0 additions & 31 deletions
@@ -4,47 +4,16 @@ cache: pip

 matrix:
   include:
-    - name: "Python 2.7 on Linux"
-      python: 2.7
-      env: PIP=pip
-    - name: "Python 3.5 on Linux"
-      python: 3.5
-    - name: "Python 3.6 on Linux"
-      python: 3.6
-    - name: "Python 3.7 on Linux"
-      python: 3.7
     - name: "Python 3.8 on Linux"
       dist: xenial
       python: 3.8
     - name: "Python 3.9 Nightly on Linux"
       dist: bionic
       python: nightly
-    - name: "Pypy on Linux"
-      python: pypy
-      env: PIP=pip
     - name: "Pypy 3 on Linux"
       python: pypy3
-    - name: "Python 3.7 on older macOS"
-      os: osx
-      osx_image: xcode9.4
-      language: shell
-      env: TOXENV=py37
-      before_install:
-        - sw_vers
-        - python3 --version
-        - pip3 --version
-    - name: "Python 3.7 on macOS"
-      os: osx
-      osx_image: xcode11
-      language: shell
-      env: TOXENV=py37
-      before_install:
-        - sw_vers
-        - python3 --version
-        - pip3 --version
   allow_failures:
     - python: nightly
-    - python: pypy
     - python: pypy3
     - os: osx

README.rst

Lines changed: 6 additions & 5 deletions
@@ -6,10 +6,10 @@
 python-readability
 ==================

-Given a html document, it pulls out the main body text and cleans it up.
+Given an HTML document, extract and clean up the main body text and title.

-This is a python port of a ruby port of `arc90's readability
-project <http://lab.arc90.com/experiments/readability/>`__.
+This is a Python port of a Ruby port of `arc90's Readability
+project <https://web.archive.org/web/20130519040221/http://www.readability.com/>`__.

 Installation
 ------------
@@ -35,7 +35,7 @@ Usage
     >>> from readability import Document

     >>> response = requests.get('http://example.com')
-    >>> doc = Document(response.text)
+    >>> doc = Document(response.content)
     >>> doc.title()
     'Example Domain'

@@ -49,6 +49,7 @@ Usage
 Change Log
 ----------

+- 0.8.2 Added article author(s) (thanks @mattblaha)
 - 0.8.1 Fixed processing of non-ascii HTMLs via regexps.
 - 0.8 Replaced XHTML output with HTML5 output in summary() call.
 - 0.7.1 Support for Python 3.7 . Fixed a slowdown when processing documents with lots of spaces.
@@ -70,6 +71,6 @@ Thanks to
 - Latest `readability.js <https://github.com/MHordecki/readability-redux/blob/master/readability/readability.js>`__
 - Ruby port by starrhorne and iterationlabs
 - `Python port <https://github.com/gfxmonk/python-readability>`__ by gfxmonk
-- `Decruft effort <http://www.minvolai.com/blog/decruft-arc90s-readability-in-python/>` to move to lxml
+- `Decruft effort <https://web.archive.org/web/20110214150709/https://www.minvolai.com/blog/decruft-arc90s-readability-in-python/>` to move to lxml
 - "BR to P" fix from readability.js which improves quality for smaller texts
 - Github users contributions.
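
The Usage hunk above changes the README example from `Document(response.text)` to `Document(response.content)`. The commit does not say why. One plausible reading, sketched here with only the standard library, is that raw bytes leave charset detection to the parser, while an already-decoded string may have been decoded with a wrong guess. The sample page and the ISO-8859-1 guess are assumptions made for illustration, not taken from the commit.

```python
# Hypothetical page: UTF-8 bytes that declare their own charset.
raw = '<meta charset="utf-8"><p>café</p>'.encode("utf-8")

# If a client decodes with a wrong guess (ISO-8859-1 is assumed here),
# the text is damaged before any HTML parser sees it.
wrong_text = raw.decode("iso-8859-1")

# Keeping the bytes lets a later step honour the declared charset.
right_text = raw.decode("utf-8")

print("café" in wrong_text)   # False: the two UTF-8 bytes of "é" became two characters
print("café" in right_text)   # True
```

Passing bytes therefore keeps the decision about the encoding in one place, at the cost of requiring the receiving code to handle detection itself.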

doc/source/conf.py

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,4 @@
 #!/usr/bin/env python3
-# -*- coding: utf-8 -*-
 #
 # readability documentation build configuration file, created by
 # sphinx-quickstart on Thu Mar 23 16:29:38 2017.
@@ -38,7 +37,7 @@
     "sphinx.ext.doctest",
     "sphinx.ext.intersphinx",
     "sphinx.ext.todo",
-    "recommonmark",
+    "myst_parser",
 ]

 # Add any paths that contain templates here, relative to this directory.
@@ -72,7 +71,7 @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = "en"

 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.

readability/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-__version__ = "0.8.1.1"
+__version__ = "0.8.3"

 from .readability import Document

readability/compat/__init__.py

Lines changed: 0 additions & 26 deletions
This file was deleted.

readability/compat/three.py

Lines changed: 0 additions & 6 deletions
This file was deleted.

readability/compat/two.py

Lines changed: 0 additions & 6 deletions
This file was deleted.

readability/encoding.py

Lines changed: 4 additions & 5 deletions
@@ -39,11 +39,10 @@ def get_encoding(page):
     for declared_encoding in declared_encodings:
         try:
             # Python3 only
-            if sys.version_info[0] == 3:
-                # declared_encoding will actually be bytes but .decode() only
-                # accepts `str` type. Decode blindly with ascii because no one should
-                # ever use non-ascii characters in the name of an encoding.
-                declared_encoding = declared_encoding.decode("ascii", "replace")
+            # declared_encoding will actually be bytes but .decode() only
+            # accepts `str` type. Decode blindly with ascii because no one should
+            # ever use non-ascii characters in the name of an encoding.
+            declared_encoding = declared_encoding.decode("ascii", "replace")

             encoding = fix_charset(declared_encoding)
             # Now let's decode the page
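
The hunk above makes the ASCII decode of the declared encoding name unconditional. A minimal sketch of that idea follows. It is not the library's `get_encoding`: the names `CHARSET_RE`, `sniff_declared_encodings`, and `decode_page`, the regular expression, and the UTF-8 fallback are all hypothetical and chosen only to show why the name must be turned into `str` before it can be used as a codec name.

```python
import re
from typing import List

# Hypothetical pattern: matches declarations such as <meta charset="utf-8">.
CHARSET_RE = re.compile(rb'<meta[^>]*charset=["\']?([^"\'>\s]+)', re.I)


def sniff_declared_encodings(page: bytes) -> List[str]:
    """Return the encoding names declared in raw HTML bytes, as str."""
    names = []
    for raw_name in CHARSET_RE.findall(page):
        # A bytes pattern yields bytes matches. Decode the name itself as
        # ASCII, replacing any stray non-ASCII byte instead of raising.
        names.append(raw_name.decode("ascii", "replace").lower())
    return names


def decode_page(page: bytes, fallback: str = "utf-8") -> str:
    """Decode with the first declared encoding that works (assumed policy)."""
    for name in sniff_declared_encodings(page):
        try:
            return page.decode(name)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit it: try the next one.
            continue
    return page.decode(fallback, "replace")
```

For example, calling `decode_page` on Latin-1 bytes that contain `<meta charset="iso-8859-1">` decodes them with that codec, and input without any declaration falls back to UTF-8 with replacement characters.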
