
Merge pull request #84 from my-dev-app/fix/unittest-placement
CHANGE: Temporarily disable filter option
0x78f1935 authored Feb 1, 2025
2 parents d9faaf9 + 2d7f5ef commit c760c7f
Showing 2 changed files with 56 additions and 49 deletions.
102 changes: 54 additions & 48 deletions README.md
@@ -1,18 +1,18 @@
## [![Stand With Ukraine](https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/banner2-direct.svg)](https://stand-with-ukraine.pp.ua)

```python
# ###### ######
# # # # ##### #### # # # # # # ###### # ## # #
# # # # # # # # # # # # # # # # # # # #
# # ###### # # # # ## # ###### ##### # # # #
####### # ##### # # ## # # # # # ###### #
# # # # # # # # # # # # # # # # #
# # # # # #### # # # # # ###### ###### # # #

By undeƒined
-------------------------------------
```

![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/my-dev-app/proxy-relay/entrypoint.yaml?branch=development)
![PyPI - Version](https://img.shields.io/pypi/v/aproxyrelay)
![PyPI - Downloads](https://img.shields.io/pypi/dw/aproxyrelay)
@@ -30,14 +30,19 @@
In addition, tested proxies will be shared with other people using this library.

Our scraper, used to obtain proxies, is highly modular and plug-and-play, making it easy to contribute to.

**NOTE: Due to maintenance, the "filter" option has been temporarily disabled.**

## Usage

AProxyRelay streamlines the process of making asynchronous requests with proxy servers. It offers the following features:

- Asynchronously fetches lists of free proxies from various sources based on the provided zone
- Tests and shares proxies with other users of the library
- Identifies and discards bad proxies, preserving data for failed target requests
- Bypasses API limiters in an asynchronous manner (for educational purposes)

### Example

```py
# -*- mode: python ; coding: utf-8 -*-
from aproxyrelay import AProxyRelay
# ... (middle of the example collapsed in the diff view)
while not data.empty():
```
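
The diff view collapses the middle of this example. For reference, a minimal end-to-end sketch of the same flow is shown below; it assumes the constructor takes the parameters documented under *Parameters* and that `start()` is the entry point returning a result queue (`start()` and its return type are assumptions based on the surrounding README, not confirmed by this diff):

```py
# Hedged sketch: AProxyRelay.start() and its return type are assumptions
# inferred from the README, not confirmed API of this commit.
from aproxyrelay import AProxyRelay

proxy_relay = AProxyRelay(
    targets=['https://httpbin.org/get'],  # endpoints to request through proxies
    timeout=5,                            # seconds a proxy may take to respond
    scrape=True,                          # slower, but yields a broader proxy pool
    filter=True,                          # NOTE: temporarily disabled upstream (see note above)
    zones=['US'],                         # only accept proxies located in these zones
    unpack=lambda data, target: data,     # shape each result before it is queued
    debug=False,
)

data = proxy_relay.start()  # assumed entry point; returns a queue of results

while not data.empty():
    print(data.get())
```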

## A Proxy Relay: Installation

Simply run

pip install aproxyrelay

### Parameters

| Parameters | Type | Function | Description |
|-------------|---------------|------------------------------------------------|--------------------------------------------------------------|
| targets | list[str] | Target endpoints provided in an array | Each endpoint will be requested with an available proxy. If a proxy is unavailable and the request fails, we store it in a queue and try it out with another proxy until we have data. |
| timeout | int | Allowed proxy timeout. **Defaults to 5** | A proxy has to respond within the provided timeout to be considered valid. Otherwise, it will be discarded. |
| scrape | bool | Indicator to utilize the proxy scraper. **Defaults to True** | The decision to scrape for proxies is determined by the value of this parameter. When set to True (default), the proxy scraper is used, which is slower but provides a broader range of proxies. When set to False, proxies are fetched from a single source, offering a faster but more limited selection. |
| filter | bool | Indicator for filtering bad proxies. **Defaults to True** | If set to True (default), the tool will test proxy connections before using them. This process might take a bit longer, but it ensures that the proxies are valid before utilization. |
| zones | list[str] | An array of proxy zones. **Defaults to ['US']** | Sometimes it matters where the proxy is located. Each item in this list ensures the proxy is located in that specific zone, and requests made from the proxy are coming from the location provided. It acts like a whitelist for allowed proxy locations. |
| unpack | lambda | Anonymous function for unpacking data. **Defaults to `lambda data, target: data`** | When a request has been made to a target through a proxy and data has been fetched, this lambda method formats the result data before putting it into the result queue. **data** -> output from the target, **target** -> target URL. |
| debug | bool | Indicator which enables debug mode. **Defaults to False** | When true, additional logging will be printed to the terminal, enabling debug mode. |


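
As an illustration, a custom `unpack` hook might look like the sketch below; the `results` key is an assumed field of the target's JSON response, not part of the library:

```py
# Hypothetical unpack hook: tag each result with its source URL.
# The 'results' key is an assumption about the target's JSON shape.
def unpack(data, target):
    return {"source": target, "rows": data.get("results", [])}
```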

## A Proxy Relay: Local Development

To install all library dependencies for local development, without installing the core library code itself, run the following command inside a virtual environment:

pip install -e .[dev]

This command installs the dependencies while linking AProxyRelay's core code from your local working copy (an editable install) rather than copying it into the virtual environment.


## Contribute to AProxyRelay

AProxyRelay encourages contributions to enhance its capabilities by allowing users to create custom scrapers. These scrapers are designed to fetch proxy data from different sources. The process is straightforward, and here's a guide to help you get started:
@@ -140,33 +144,34 @@ class MainScraper(ScraperCore):
To contribute your own proxy scraper, follow these steps:

1. ### Create a new parser class inside the scrapers folder
- Inherit from the `MainScraper`.
- Overwrite the necessary methods required for scraping additional proxy servers.
2. ### Methods to Overwrite:
- `format_url`: Manipulate the proxy list request URL before making a request, enabling adjustment of various query parameters.
- `format_raw`: When the data obtained from the link is `txt/html`, this method should scrape the data and format it into workable data.
- `format_data`: This method is triggered when the call to the proxy list returns a dictionary, or when format_raw has been completed.
   - For a full overview of the available methods to overwrite, see: [Scrapers](./.github/docs/Scrapers.md)
3. ### Formatting Data:
   - Your goal is to format the data in the `format_data` method and place it into the provided Queue (a fuller scraper sketch follows this list). The data should be structured as follows:
```python
data = {
"zone": "US",
"method": "http",
"anonymity": "anonymous",
"protocol": "http",
"port": "8080",
"ip": "127.0.0.1",
}
queue.put(data)
```
4. ### Congratulations
- If done correctly, congratulations! You've successfully created a new proxy parser for this library.
- Add the targeted link and your scraper to `scrapers/__init__.py`.
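
As referenced in step 3, below is a minimal sketch of a custom scraper. The method signatures and the `.parser` import path are assumptions based on the steps above, not the library's confirmed interface, and the parsing logic is purely illustrative:

```python
# Hedged sketch of a custom scraper; the signatures and import path are
# assumptions, and the parsing logic is illustrative only.
from queue import Queue

from .parser import MainScraper  # assumed location of MainScraper


class ExampleProxyScraper(MainScraper):
    @classmethod
    def format_url(cls, url: str, *args, **kwargs) -> str:
        # Adjust query parameters before the proxy list is requested
        return f"{url}?limit=100"

    @classmethod
    async def format_raw(cls, html: str) -> list:
        # Turn a txt/html payload (one "ip:port" per line) into workable rows
        return [line.split(':', 1) for line in html.splitlines() if ':' in line]

    @classmethod
    async def format_data(cls, data: list, queue: Queue) -> None:
        # Shape each entry and push it onto the provided queue
        for ip, port in data:
            queue.put({
                'zone': 'US',
                'method': 'http',
                'anonymity': 'anonymous',
                'protocol': 'http',
                'port': port,
                'ip': ip,
            })
```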

Feel free to contribute, share your improvements, and expand the library's capabilities. Your efforts contribute to a growing pool of available proxies for the AProxyRelay community.

## Compiling to package

To compile the library into a package, use the following command after installing `requirements.txt`:

```sh
python setup.py sdist bdist_wheel
```

This will generate the package in the `dist` folder.

**_Note: A custom version can be set with the environment variable `CUSTOM_VERSION`_**
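
For example, the variable can be set inline (a sketch assuming a POSIX shell; the version string shown is arbitrary):

```sh
CUSTOM_VERSION=1.2.3 python setup.py sdist bdist_wheel
```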

## Versioning

The version published on [PyPi](https://pypi.org/project/aproxyrelay/) carries a number derived from the pipeline builds and looks like:

`aproxyrelay 1.104.1rc7696232336`

Breaking the version down:

- `aproxyrelay`: The name of the package
- `1`: Major version
- `104`: Build number
- `1`: Build retry number
- `rc7696232336`: Random build number
3 changes: 2 additions & 1 deletion aproxyrelay/core.py
@@ -81,7 +81,8 @@ async def get_proxies(self) -> None:
self.logger.info(f'[aProxyRelay] Validating: Proxies ({self._queue_filter.qsize()}), checking if proxies meet connection requirements ...') # noqa: B950
self.logger.info(f'[aProxyRelay] Keep an eye on "pending task name" once it reaches ({self._queue_filter.qsize()}) all tests have been completed') # noqa: B950
self.logger.info('[aProxyRelay] Grab some coffee ... please wait ...')
-        await self._test_all_proxies()
+        if False:  # TODO: This method is temporarily disabled
+            await self._test_all_proxies()
self.logger.info(f'[aProxyRelay] Filter: Found {self._filtered_failed} incompetent and {self._filtered_available} available proxy servers in {datetime.now(UTC) - self.started}') # noqa: B950
else:
while not self._queue_filter.empty():
Expand Down
