PyEGo: Inferring Environment Dependencies for Python Programs
The EGo-system can be visited online; this is the README of the PyEGo command-line tool.
PyEGo is a tool for automatically inferring the environment dependencies of Python programs.
A Python program's environment dependencies mainly consist of three parts:
Compatible Python interpreter version;
Dependent Python third-party packages;
Dependent system libraries.
For example, the following snippet prints an emoji on the terminal:
import emoji
print emoji.emojize('Python is :thumbs_up:')
This snippet is only compatible with Python 2, because print is used as a statement, without parentheses.
If we run the snippet in Python 3:
$ python example/
File "example/", line 2
print emoji.emojize('Python is :thumbs_up:')
SyntaxError: invalid syntax
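The interpreter-version problem shown above can be detected without running the program. The following is a minimal sketch, not PyEGo's actual inference method: it only asks whether the source parses under the Python 3 interpreter that runs the check. The helper name `parses_as_python3` is hypothetical.

```python
import ast


def parses_as_python3(source: str) -> bool:
    """Return True if `source` is syntactically valid for the running Python 3."""
    try:
        ast.parse(source)  # parses only; nothing is imported or executed
        return True
    except SyntaxError:
        return False


# The snippet from this README, and a Python 3 variant of it
py2_snippet = "import emoji\nprint emoji.emojize('Python is :thumbs_up:')\n"
py3_snippet = "import emoji\nprint(emoji.emojize('Python is :thumbs_up:'))\n"

print(parses_as_python3(py2_snippet))
print(parses_as_python3(py3_snippet))
```

A failed parse only says that the code is not valid for this one interpreter; it does not by itself prove that the code is valid Python 2.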
In addition, the snippet depends on the third-party Python package emoji.
If we run the snippet without installing emoji:
$ python
Traceback (most recent call last):
File "example/", line 1, in <module>
import emoji
ImportError: No module named emoji
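A missing package like this one can also be spotted statically. The sketch below is an illustration under stated assumptions, not PyEGo's implementation: it collects the top-level modules a source file imports and reports those that cannot be found in the current environment. The function names and the module name `no_such_pkg_pyego_demo` are hypothetical, and the source must parse under the interpreter that runs the check.

```python
import ast
import importlib.util


def top_level_imports(source: str) -> set:
    """Collect the top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return names


def missing_modules(source: str) -> set:
    """Return imported modules that cannot be found in the current environment."""
    return {name for name in top_level_imports(source)
            if importlib.util.find_spec(name) is None}


source = (
    "import os\n"
    "import json.decoder\n"
    "from collections import OrderedDict\n"
    "import no_such_pkg_pyego_demo\n"  # hypothetical name, assumed not installed
)
print(sorted(missing_modules(source)))
```

An import name is not always the name of the package that provides it, and this check says nothing about which package version is compatible, so it covers only part of the dependency inference problem.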
PyEGo can build a runtime environment for the snippet:
$ python -r example/
It then outputs a Dockerfile:
FROM python:2.7
RUN sed -i s@/ /etc/apt/sources.list
RUN apt-get clean
RUN apt-get update
RUN pip install --upgrade pip
RUN pip config set global.index-url
RUN pip install emoji==0.6.0
# add CMD command to run your programs here
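To make the shape of this output concrete, here is a minimal sketch of how inferred dependencies could be rendered into a Dockerfile of the same form. It is not PyEGo's generator: `render_dockerfile` and its parameters are hypothetical, the `apt-get install` step is an assumption about how system libraries might be installed, and the source-list and index-url lines of the generated file are left out.

```python
from typing import Iterable


def render_dockerfile(python_version: str,
                      pip_packages: Iterable[str] = (),
                      apt_packages: Iterable[str] = ()) -> str:
    """Render a Dockerfile from an interpreter version and two package lists."""
    lines = [f"FROM python:{python_version}"]
    apt = list(apt_packages)
    if apt:
        # Assumed handling of system libraries; only emitted when some are given
        lines.append("RUN apt-get update")
        lines.append("RUN apt-get install -y " + " ".join(apt))
    lines.append("RUN pip install --upgrade pip")
    for pkg in pip_packages:
        lines.append(f"RUN pip install {pkg}")
    lines.append("# add CMD command to run your programs here")
    return "\n".join(lines) + "\n"


dockerfile = render_dockerfile("2.7", ["emoji==0.6.0"])
print(dockerfile)
```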
Add a CMD instruction to run the snippet, then build the Docker image:
$ echo "CMD python" >> example/Dockerfile
$ cd example
$ docker build -t ego .
Now, run it!
$ docker run ego
Python is 👍
Install locally
Install Python>=3.6
Install dependent Python packages:
$ pip install -r requirements.txt
Install Neo4j >=3.5.13, <4
Merge PyKG:
Our knowledge graph, PyKG, is split into two files because of a file size limit; merge them before loading it:
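As a rough illustration of such a merge, the sketch below concatenates part files byte by byte. It assumes the two PyKG files are plain sequential splits of one file, which this README does not confirm; the function `merge_parts` and the `graph.part1`, `graph.part2`, and `graph.merged` names are hypothetical placeholders, not the real PyKG file names.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def merge_parts(parts: Iterable[Path], dest: Path) -> int:
    """Concatenate `parts` in order into `dest` and return the bytes written."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large files are fine
    return dest.stat().st_size


# Demo on throwaway files; replace the placeholder names with the real parts
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "graph.part1").write_bytes(b"first half,")
    (tmp / "graph.part2").write_bytes(b"second half")
    size = merge_parts([tmp / "graph.part1", tmp / "graph.part2"],
                       tmp / "graph.merged")
    merged = (tmp / "graph.merged").read_bytes()

print(size)
```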
Program root can be either a single .py file or a Python project folder.
PyEGo provides two types of output: Dockerfile and dependency.json. The default output type is Dockerfile.
For Dockerfile output, set --output_type=Dockerfile (-t Dockerfile); for JSON output, set --output_type=json.
--output_path (-p) specifies the output path of the Dockerfile or dependency.json. By default, PyEGo generates the file in the parent folder of PROGRAM_ROOT.
For more help, see:
$ python -h
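The options described above can be pictured with a small argparse sketch. This is a reconstruction for illustration only, not PyEGo's actual parser: the flags -t/--output_type and -p/--output_path come from this README, -r appears in the example invocation, and the long name `--root` and the path `my_project/` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Sketch of the output options described in this README")
    # -r is taken from the example invocation; the long name --root is assumed
    parser.add_argument("-r", "--root", required=True,
                        help="PROGRAM_ROOT: a single .py file or a project folder")
    parser.add_argument("-t", "--output_type",
                        choices=["Dockerfile", "json"], default="Dockerfile",
                        help="type of output to generate")
    # None stands for "use the parent folder of PROGRAM_ROOT"
    parser.add_argument("-p", "--output_path", default=None,
                        help="where to write the Dockerfile or dependency.json")
    return parser


default_args = build_parser().parse_args(["-r", "my_project/"])
json_args = build_parser().parse_args(["-r", "my_project/", "-t", "json"])
print(default_args.output_type, json_args.output_type)
```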
If you built the Docker image of PyEGo, you can run PyEGo with:
$ python experiment/ --run
Compare PyEGo results with DockerizeMe and Pipreqs
Run DockerizeMe and Pipreqs
We provide the bash scripts we used to run DockerizeMe and Pipreqs in our experiments.
script/* uses DockerizeMe to generate Dockerfiles for gists. Note that the script must be run in the DockerizeMe Vagrant environment (provided by DockerizeMe).
# Run the script in DockerizeMe vagrant
$ cd /PATH/TO/PyEGo/script
$ bash
script/* uses Pipreqs to generate requirements.txt files and Dockerfiles for gists. Note that the script must be run after installing pipreqs (pip install pipreqs) in Python 2.7.
# Edit line2 and line3: /YOUR/HARD/GISTS/ROOT/OF/PIPREQS
$ cd /PATH/TO/PyEGo/script
$ bash
script/* builds Docker images from the DockerizeMe-generated or Pipreqs-generated Dockerfiles, runs the Docker containers, checks the results, and records them in log.txt.
# Edit line2: cd /YOUR/HARD/GISTS/ROOT
$ cd /PATH/TO/PyEGo/script
$ bash
Edit the configuration to set up the Neo4j connection:
We also provide a Docker image of PyEGo. Build the Docker image with:
Instructions
Start Neo4j before running PyEGo:
If you installed PyEGo locally, you can run PyEGo with:
Replay our experiment
Experiment on Hard-gists
Experimental results are available in another repository, exp-gist.
Run PyEGo on Hard-gists
Experiment on Github dataset
Results of experiments are available in another repository, exp-github.
Download dataset
Our dataset is available on
Run PyEGo on Github dataset
Compare PyEGo results with DockerizeMe and Pipreqs
Install pipreqs in Python 3.6+.
Edit experiment/, and configure the GitHub dataset root and the pipreqs path. You can find the pipreqs path with:
Run pipreqs:
We provide the results of DockerizeMe in exp-github.
Experiment running PyEGo with different strategies
Results of experiments are available in exp-gist.