Hi everyone! 👋 In this post, I’m going to show you how you can use the GitHub API to query Pull Requests, check the content of a PR and close it.
The motivation for this project came from my personal website. I introduced static comments on the website using Staticman and after only a day or two, got bombarded with spam. I hadn’t enabled Akismet or any honeypot field so it was kinda expected. However, this resulted in me getting 200+ PRs on GitHub for bogus comments which were mainly advertisements for amoxicillin (this was also the first time I found out how famous this medicine is).
I was in no mood for going through the PRs manually so I decided to write a short script which went through them on my behalf and closed the PRs which mentioned certain keywords.
Staticman had opened a long list of PRs on the repo, and most of them were spam.
For this project, I decided to use the PyGithub library. It’s super easy to install using pip:
pip install pygithub
Now we can go ahead and log in to GitHub using PyGithub. Write the following code in a github_clean.py file:
import argparse
import logging
import sys

from github import Github


def parse_arguments():
    """
    Parses arguments
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--username',
                        required=True, help="GitHub username")
    parser.add_argument('-p', '--password',
                        required=True, help="GitHub password")
    parser.add_argument('-r', '--repository',
                        required=True, help="repository name")
    parsed_args = parser.parse_args()
    # The repository must be given in the username/repo_name form
    if "/" not in parsed_args.repository:
        logging.error("repo name should also contain username like: username/repo_name")
        sys.exit(1)
    return parsed_args


def main():
    args = parse_arguments()
    g = Github(args.username, args.password)


if __name__ == '__main__':
    main()
So far I’m just using argparse to accept and parse the command line arguments and then using the arguments to create a Github object.
You will be passing in three arguments:
- Your GitHub username
- Your GitHub password
- The repo you wish to work with
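As a side note, GitHub has since stopped accepting account passwords for API requests, so the value you pass as the password generally needs to be a personal access token. Here is a minimal sketch, assuming you would rather not type that secret on the command line: the password argument becomes optional and falls back to an environment variable. The variable name GITHUB_TOKEN and the argv parameter are my own assumptions for illustration, not part of the script in this post.

```python
import argparse
import os


def parse_arguments(argv=None):
    """Parse CLI arguments; argv=None means "read sys.argv"."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--username',
                        required=True, help="GitHub username")
    # Assumption: GITHUB_TOKEN is a variable name picked for this sketch.
    parser.add_argument('-p', '--password',
                        default=os.environ.get("GITHUB_TOKEN"),
                        help="GitHub password or token (defaults to $GITHUB_TOKEN)")
    parser.add_argument('-r', '--repository',
                        required=True, help="repository name, like username/repo_name")
    args = parser.parse_args(argv)
    if not args.password:
        parser.error("pass --password or set GITHUB_TOKEN")
    if "/" not in args.repository:
        # parser.error prints the usage text and exits with status 2
        parser.error("repo name should also contain username like: username/repo_name")
    return args
```

Using parser.error instead of logging plus sys.exit keeps the validation message next to the usage text, which is a matter of taste rather than a requirement.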
The next step is to figure out how to loop through all the pull requests and check if their body contains any “spam” words:
repo = g.get_repo(args.repository)
issues = repo.get_issues()
page_num = 0
while True:
    issue_page = issues.get_page(page_num)
    if issue_page == []:
        break
    for issue in issue_page:
        # Do something with the individual issue
        if spam_word in issue.raw_data['body'].lower():
            print("Contains spam word!!")
    # Move on to the next page, otherwise the loop never ends
    page_num += 1
First, we query GitHub for a specific repo using g.get_repo and then we query for issues for that repo using repo.get_issues. It is important to note that all PRs are registered as issues as well, so querying for issues will return pull requests too. GitHub returns a paginated result so we just keep asking for successive pages in a while loop until we get an empty page.
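To make the paging loop easier to reason about, here is a minimal sketch that wraps it in a generator. The names iter_pages, is_pull_request and FakeIssues are hypothetical; FakeIssues only mimics the get_page method so the loop can run without talking to GitHub. The check on the pull_request key relies on GitHub’s issue payload, where only issues that are really PRs carry that key.

```python
def iter_pages(paginated):
    """Yield items from successive pages until an empty page comes back."""
    page_num = 0
    while True:
        page = paginated.get_page(page_num)
        if not page:
            break
        yield from page
        page_num += 1


def is_pull_request(raw_issue):
    """Issues that are really PRs carry a 'pull_request' key in their payload."""
    return raw_issue.get("pull_request") is not None


class FakeIssues:
    """Hypothetical stand-in for the object returned by repo.get_issues()."""

    def __init__(self, pages):
        self._pages = pages

    def get_page(self, page_num):
        # Mimic the behaviour the post relies on: an empty list past the end
        return self._pages[page_num] if page_num < len(self._pages) else []


pages = [
    [{"number": 1, "pull_request": {"url": "placeholder"}}, {"number": 2}],
    [{"number": 3, "pull_request": {"url": "placeholder"}}],
]
pr_numbers = [raw["number"] for raw in iter_pages(FakeIssues(pages))
              if is_pull_request(raw)]
print(pr_numbers)  # [1, 3]
```

With real PyGithub objects you would run the same check against issue.raw_data. The paginated object returned by get_issues can also be iterated directly with a plain for loop, which fetches further pages as needed.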
We can check the body of an issue (PR) using issue.raw_data['body']. Two important pieces are missing from the above code. One is the spam_word variable and the other is some sort of mechanism to close an issue.
For the spam_word, I took a look at some issues and created a list of some pretty common spam words. This is the list I came up with:
spam_words = ["buy", "amoxi", "order", "tablets",
"pills", "cheap", "viagra", "forex", "cafergot",
"kamagra", "hacker", "python training"]
Add this list at the top of your github_clean.py file and modify the if statement like this:
closed = False
if any(spam_word in issue.raw_data['body'].lower() for spam_word in spam_words):
    issue.edit(state="closed")
    closed = True
print(f"{issue.number}, closed: {closed}")
With this final snippet of code, we have everything we need. My favorite function in this code snippet is any. It returns True as soon as at least one element of the iterable passed to it is truthy, and False otherwise.
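Here is a small standalone sketch of the any check so you can play with it outside the script. The function names contains_spam and contains_spam_word are hypothetical, and the shortened word list is only for the demo. Two things it illustrates: a PR without a description comes back with an empty (None) body, so the sketch guards against that, and plain substring matching also fires on harmless words such as “border”. The second function is a stricter variant using re, not what the script in this post does.

```python
import re

spam_words = ["buy", "amoxi", "order"]


def contains_spam(body, words):
    """Substring check as in the post; a missing body counts as clean."""
    text = (body or "").lower()
    return any(word in text for word in words)


def contains_spam_word(body, words):
    """Stricter variant: a spam word must start at a word boundary."""
    text = (body or "").lower()
    return any(re.search(r"\b" + re.escape(word), text) for word in words)


print(contains_spam("Buy Amoxicillin online", spam_words))     # True
print(contains_spam(None, spam_words))                          # False
print(contains_spam("Fix the border width", spam_words))        # True, "order" is inside "border"
print(contains_spam_word("Fix the border width", spam_words))   # False
```

Whether the stricter match is worth it depends on your repo: for a comments-only repo like mine, false positives were not a concern.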
This is what your whole file should look like:
import argparse
import sys
import logging
from github import Github

spam_words = ["buy", "amoxi", "order", "tablets",
              "pills", "cheap", "viagra", "forex", "cafergot",
              "kamagra", "hacker", "python training"]
logging.basicConfig(level=logging.INFO)


def parse_arguments():
    """
    Parses arguments
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--username',
                        required=True, help="GitHub username")
    parser.add_argument('-p', '--password',
                        required=True, help="GitHub password")
    parser.add_argument('-r', '--repository',
                        required=True, help="repository name")
    parsed_args = parser.parse_args()
    # The repository must be given in the username/repo_name form
    if "/" not in parsed_args.repository:
        logging.error("repo name should also contain username like: username/repo_name")
        sys.exit(1)
    return parsed_args


def process_issue(issue):
    """
    Processes each issue and closes it
    based on the spam_words list
    """
    closed = False
    # A PR without a description has an empty (None) body
    body = (issue.raw_data['body'] or "").lower()
    if any(bad_word in body for bad_word in spam_words):
        issue.edit(state="closed")
        closed = True
    return closed


def main():
    """
    Coordinates the flow of the whole program
    """
    args = parse_arguments()
    g = Github(args.username, args.password)
    logging.info("successfully logged in")
    repo = g.get_repo(args.repository)
    logging.info("getting issues list")
    issues = repo.get_issues()
    page_num = 0
    while True:
        issue_page = issues.get_page(page_num)
        if issue_page == []:
            logging.info("No more issues to process")
            break
        for issue in issue_page:
            closed = process_issue(issue)
            logging.info(f"{issue.number}, closed: {closed}")
        page_num += 1


if __name__ == '__main__':
    main()
I just added a couple of different things to this script, like the logging. If you want, you can create a new command-line argument and use that to control the log level. It isn’t really useful here because we don’t have a lot of different log levels.
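If you do want to try the log-level idea, a minimal sketch could look like this. The --log-level flag and the parse_log_level helper are hypothetical names, not something the script above defines.

```python
import argparse
import logging


def parse_log_level(argv=None):
    """Return the numeric logging level chosen on the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-l', '--log-level', default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
                        help="how much detail to log")
    args = parser.parse_args(argv)
    # logging exposes every level as a module attribute, e.g. logging.DEBUG
    return getattr(logging, args.log_level)


level = parse_log_level(["--log-level", "WARNING"])
logging.basicConfig(level=level)
logging.info("hidden at WARNING level")
logging.warning("still shown")
```

In the real script you would add the argument to the existing parser in parse_arguments and call logging.basicConfig after parsing, instead of at import time.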
Now if you run this script you should see something similar to this:
INFO:root:successfully logged in
INFO:root:getting issues list
INFO:root:No more issues to process
It doesn’t process anything in this run because I’ve already run this script once and there are no more spam issues left.
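If you are nervous about closing the wrong PRs on a first run, one option is a dry-run mode that only reports what would be closed. This is a sketch under my own assumptions: the dry_run parameter is not in the script above, and FakeIssue is a hypothetical stand-in that exposes just the raw_data, number and edit members the script uses.

```python
class FakeIssue:
    """Hypothetical stand-in for a PyGithub issue, for offline testing."""

    def __init__(self, number, body):
        self.number = number
        self.raw_data = {"body": body}
        self.state = "open"

    def edit(self, state):
        self.state = state


def process_issue(issue, spam_words, dry_run=True):
    """Return True if the issue looks like spam; close it unless dry_run."""
    body = (issue.raw_data["body"] or "").lower()
    is_spam = any(word in body for word in spam_words)
    if is_spam and not dry_run:
        issue.edit(state="closed")
    return is_spam


words = ["amoxi", "viagra"]
spam = FakeIssue(7, "Cheap AMOXIcillin here")
legit = FakeIssue(8, "Nice article, thanks!")

print(process_issue(spam, words), spam.state)                 # True open
print(process_issue(spam, words, dry_run=False), spam.state)  # True closed
print(process_issue(legit, words, dry_run=False), legit.state)  # False open
```

Defaulting dry_run to True means a real close only happens when you ask for it explicitly.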
So there you go! I hope you had fun making this! If you have any questions/comments/suggestions please let me know in the comments below! See you in the next post 🙂 ♥️