When PYINSP_INDEX_URL is defined, package metadata are not fetched #260

@nnobelis

Description

Consider the following example:

$ PYINSP_INDEX_URL=https://my_artifactory/simple python-inspector --python-version 313 --operating-system linux --json-pdt /tmp/pi.json --analyze-setup-py-insecurely --setup-py setup.py  --verbose 
Resolving dependencies...
Using netrc file /home/user/.netrc
direct_dependencies:
 DependentPackage(purl='pkg:pypi/jsonschema', extracted_requirement='jsonschema', scope='install')
environment: Environment(python_version='313', operating_system='linux')
repos:
 PypiSimpleRepository(index_url='https://my_artifactory/simple', credentials=None)
versions:
  retrieved versions for package 'jsonschema'
dependencies:
  retrieved dependencies for requirement 'pkg:pypi/jsonschema'
retrieve package data from pypi:
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
done!

$ jq .packages[].name < /tmp/pi.json

<no output>

And compare it to the working scenario, without PYINSP_INDEX_URL:

$ python-inspector --python-version 313 --operating-system linux --json-pdt /tmp/pi.json --analyze-setup-py-insecurely --setup-py setup.py  --verbose 
Resolving dependencies...
Using netrc file /home/user/.netrc
direct_dependencies:
 DependentPackage(purl='pkg:pypi/jsonschema', extracted_requirement='jsonschema', scope='install')
environment: Environment(python_version='313', operating_system='linux')
repos:
 PypiSimpleRepository(index_url='https://pypi.org/simple', credentials=None)
versions:
  retrieved versions for package 'jsonschema'
dependencies:
  retrieved dependencies for requirement 'pkg:pypi/jsonschema'
retrieve package data from pypi:
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
  retrieved package 'pkg:pypi/[email protected]'
done!

$ jq .packages[].name < /tmp/pi.json 
"attrs"
"jsonschema-specifications"
"jsonschema"
"referencing"
"rpds-py"

All these packages are present in https://my_artifactory/simple.

After debugging, the problematic code has been found to be:

for dist_url in valid_distribution_urls:
    if dist_url not in urls:
        continue

Here, valid_distribution_urls are:

['https://my_artifactory/simple/../3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl',
'https://my_artifactory/simple/../6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz']

Whereas urls are:

{
'https://files.pythonhosted.org/packages/3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl': {XXX},
'https://files.pythonhosted.org/packages/6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz': {XXX}
}
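The mismatch can be reproduced in isolation. The following is a minimal sketch that uses the two sets of URLs quoted above; the dictionary values are placeholders standing in for the elided API data, and the list comprehension only mimics the membership test, it is not the project's code.

```python
# URLs advertised by the custom index (copied from the report above).
valid_distribution_urls = [
    "https://my_artifactory/simple/../3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl",
    "https://my_artifactory/simple/../6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz",
]

# Keys built from the pypi.org API response; the values are placeholders.
urls = {
    "https://files.pythonhosted.org/packages/3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl": {},
    "https://files.pythonhosted.org/packages/6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz": {},
}

# Same membership test as in the loop: exact string equality on the full URL.
matched = [dist_url for dist_url in valid_distribution_urls if dist_url in urls]
print(matched)  # prints []
```

Both files are the same distributions, but the host and path prefix differ, so no key ever matches.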

The problem is that urls is always built from the URLs in the API response of https://pypi.org/pypi, whatever index is configured. With a custom index, the check on the distribution URLs therefore always fails, and get_pypi_data_from_purl returns None for the package.
Consequence: the package is present in the dependency tree but missing from the packages array of the output file.
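
One possible direction, offered only as a sketch and not as the project's fix, is to compare distributions by filename rather than by full URL. This assumes the custom index serves the same files under the same names as pypi.org. The helpers filename_of and match_by_filename are hypothetical names introduced for illustration, and the dictionary values are placeholders.

```python
import posixpath
from urllib.parse import urlparse


def filename_of(url):
    """Return the last path component of a URL, e.g. 'attrs-25.4.0.tar.gz'."""
    return posixpath.basename(urlparse(url).path)


def match_by_filename(valid_distribution_urls, urls):
    """Hypothetical helper: map each index URL to the API entry with the same filename."""
    by_name = {filename_of(url): data for url, data in urls.items()}
    matches = {}
    for dist_url in valid_distribution_urls:
        data = by_name.get(filename_of(dist_url))
        if data is not None:
            matches[dist_url] = data
    return matches


# URLs from the report above; the values are placeholders for the API data.
valid_distribution_urls = [
    "https://my_artifactory/simple/../3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl",
    "https://my_artifactory/simple/../6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz",
]
urls = {
    "https://files.pythonhosted.org/packages/3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl": {"placeholder": "wheel"},
    "https://files.pythonhosted.org/packages/6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz": {"placeholder": "sdist"},
}

matches = match_by_filename(valid_distribution_urls, urls)
print(len(matches))  # prints 2
```

Matching on the filename alone trusts the mirror to serve identical content; comparing the digests published by both sources would be a stricter check.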
