Use as a library
pyastgrep is structured internally to make it easy to use as a library as well as a CLI, with a clear separation of the different layers. For now, the following API is documented as public and we will strive to maintain backwards compatibility with it.
For other things, while we will try not to break things without good reason, at this point we are not documenting or guaranteeing API stability for these functions. Please contribute to the discussion if you have needs here.
- pyastgrep.api.search_python_files(paths, expression, python_file_processor=process_python_file)
Searches for files with AST matching the given XPath expression, in the given paths.

If paths contains directories, then all Python files in that directory and below will be found, but .gitignore and other rules are used to ignore files and directories automatically.

Returns an iterable of Match objects, plus other objects. The other objects are used to indicate errors, usually things like a failure to parse a file that had a .py extension. The details of these other objects are not documented yet, so use them at your own risk, and make sure you filter the results with an isinstance check for the Match objects.

By default, search_python_files does no caching of the conversion of Python to XML, which is appropriate for normal command line usage. However, this conversion is relatively expensive, and for various use cases as a library, you might want to cache this operation.

To achieve this, you can pass the python_file_processor argument. This value must be a callable that takes a pathlib.Path object and returns a ProcessedPython object or a ReadError object.

By default this is process_python_file(), but an alternative can be provided, such as process_python_file_cached(), or your own callable that will typically wrap process_python_file() in some other way.

- Parameters:
paths (list[pathlib.Path]) – List of paths to search, which can be files or directories, of type pathlib.Path
expression (str) – XPath expression
python_file_processor – callable that takes a pathlib.Path object and returns a ProcessedPython object or a ReadError object.
- Returns:
Iterable[Match | Any]
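The isinstance filtering described above can be sketched as follows. This is a minimal illustration, not part of the documented API: the helper names only_matches and find_function_defs are hypothetical, and the XPath expression .//FunctionDef is just an example query. The pyastgrep import is done inside the function so the filtering helper can be used on its own.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Iterable, Iterator


def only_matches(results: Iterable[Any], match_type: type) -> Iterator[Any]:
    """Yield only the items that are instances of match_type.

    Hypothetical helper: everything else (the undocumented error objects)
    is silently skipped.
    """
    return (item for item in results if isinstance(item, match_type))


def find_function_defs(paths: list[Path]) -> list[Any]:
    """Return Match objects for every function definition under paths.

    Hypothetical wrapper; requires pyastgrep to be installed.
    """
    # Imported lazily so that only_matches() works without pyastgrep.
    from pyastgrep.api import Match, search_python_files

    results = search_python_files(paths, ".//FunctionDef")
    return list(only_matches(results, Match))
```

Calling find_function_defs([Path("src")]) would then return only Match objects, whose path property gives the file containing each match. If you need to know about files that failed to parse, inspect the skipped objects instead of discarding them, bearing in mind that their details are not documented.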
- class pyastgrep.api.Match
Represents a matched AST node. The public properties of this are:
- property path
The path of the file containing the match.
- Type:
- class pyastgrep.api.Position
- pyastgrep.api.process_python_file(path)
Default value of the python_file_processor parameter above: a function that parses a Python file to create the AST and the XML version. This does no caching. You should not need to call this yourself.
- pyastgrep.api.process_python_file_cached(path)
Wrapper for process_python_file() that caches indefinitely in memory, based on the input filename only.

This can be an appropriate caching strategy:
  - if you are operating on a fairly limited number of Python files (or if available memory is not a problem)
  - if you have a fairly short-lived process
  - if you don’t need to respond to on-disk changes to file contents for the lifetime of the process.
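If you do need to respond to on-disk changes, one option is to write your own python_file_processor. The sketch below is one possible approach, not something pyastgrep provides: make_mtime_cached_processor is a hypothetical helper that wraps any processor callable and re-runs it only when the file's modification time changes. It treats the wrapped function's return value as opaque, so it caches error results as well.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Callable


def make_mtime_cached_processor(
    process: Callable[[Path], Any],
) -> Callable[[Path], Any]:
    """Wrap a processor so results are reused until the file's mtime changes.

    Hypothetical helper: `process` is expected to be something like
    pyastgrep.api.process_python_file, but any callable taking a Path works.
    """
    cache: dict[Path, tuple[int, Any]] = {}

    def processor(path: Path) -> Any:
        try:
            mtime = path.stat().st_mtime_ns
        except OSError:
            # Cannot stat the file: let the wrapped processor report the error.
            return process(path)
        cached = cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]
        result = process(path)
        cache[path] = (mtime, result)
        return result

    return processor


# Intended usage (requires pyastgrep; create the processor once and reuse it):
#
#   from pyastgrep.api import process_python_file, search_python_files
#   processor = make_mtime_cached_processor(process_python_file)
#   results = search_python_files(paths, expression, python_file_processor=processor)
```

The cache here still grows without bound, so the memory caveat above applies equally. Modification times can also miss edits that happen within the filesystem's timestamp resolution.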
- class pyastgrep.api.ProcessedPython
Return type of process_python_file(). For now, this is an opaque type, as you should not need to construct this yourself – you should be wrapping process_python_file(), which will construct this for you.
- class pyastgrep.api.ReadError
Return type of process_python_file() for the case of an error reading the file. This is again an opaque type for now.
Example
For example usage of search_python_files, see the blog post pyastgrep and custom linting.