Methodology — how we collect jobs

📥 Where the jobs come from

27 Telegram channels with job ads for the CPA industry. We have no partnership with these channels — they are public sources we crawl once an hour.

The list has been built by hand since 2018. We have watched the CPA community for a long time and only add a new channel after manual screening. The rule is simple: within 30 days the channel must publish at least 5 real job ads, each with a specific company, role and contacts. If 80% of a channel is reposts of paid courses, crypto-wallet ads or «buying FB account» posts, we pass on it.

The current source list (HR category):

@adhunt_cpa_job, @aff_hunter, @aff_job, @arbitrage_work, @arbitrazh_vakansii_rabota, @be01team, @cpa_traffic_hr, @gamblingservices, @headshotagency, @hr_affiliate, @hr_arbitraz, @hr_b00st, @hrcpa, @hrcpagram, @job4aff, @job_cpa, @kashjob, @mediabuyers_lenkep, @opento_igaming, @partnerkin_job, @pro_vacancy, @r2bwork, @recruitingtc, @talents, @traff_job, @works_affiliate, @works_cpa

Among them are official HR accounts of companies (Parimatch Talents, LENKEP, Partnerkin), independent HR agencies (CPA Hunter, Headshot, Adhunt), and aggregators (CPA WORK, Affiliate Work). Every job card on the site shows the primary source — which channel posted it and when.

🔧 How the parser works

Sitting on 27 channels from a single account is a fast track to a ban — Telegram treats that as suspicious activity. Instead we run a pool of 15 accounts through the Telethon library (Python). They take turns round-robin: one account pulls posts from five channels, the next pulls from another five, and so on.
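A minimal sketch of how such a rotation could be planned, under stated assumptions. The function name, the placeholder channel and account labels, and the run_index offset (which shifts the load to different accounts on each hourly run) are illustrative and not taken from the production parser. Only the batch size of five and the pool of 15 accounts come from the description above.

```python
def assign_channels(channels, accounts, batch_size=5, run_index=0):
    """Split channels into batches and hand each batch to the next account in turn.

    run_index is a hypothetical knob: it offsets the starting account so that
    consecutive runs do not always load the same accounts.
    """
    batches = [channels[i:i + batch_size] for i in range(0, len(channels), batch_size)]
    plan = {}
    for n, batch in enumerate(batches):
        account = accounts[(run_index * len(batches) + n) % len(accounts)]
        plan.setdefault(account, []).extend(batch)
    return plan


channels = [f"channel_{i}" for i in range(27)]  # placeholder names
accounts = [f"account_{i}" for i in range(15)]  # placeholder session labels

plan = assign_channels(channels, accounts)
for account, assigned in plan.items():
    print(account, len(assigned))
```

With 27 channels and batches of five, one pass needs six accounts, so the rest of the pool stays idle until a later run picks them up.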

Every 60 minutes the parser walks the feed, picks up new posts, runs them through a set of regular expressions and writes the results into the careers.jobs table. If a channel posted nothing new — we skip it. If it posted spam or an «account for sale» ad — we filter that out at the very first stage.
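The first-stage filter can be pictured roughly like this. It is a sketch only: the two patterns, the post dictionaries with id and text keys, and the last_seen_id bookkeeping are assumptions made for illustration, and the real regex set is larger.

```python
import re

# Hypothetical examples of first-stage spam patterns.
SPAM_PATTERNS = [
    re.compile(r"\baccounts?\s+for\s+sale\b", re.IGNORECASE),
    re.compile(r"\b(?:buy|buying|selling)\s+(?:fb|facebook)\s+accounts?\b", re.IGNORECASE),
]


def is_obvious_spam(text):
    """True if any spam pattern occurs anywhere in the post text."""
    return any(pattern.search(text) for pattern in SPAM_PATTERNS)


def new_posts(posts, last_seen_id):
    """Keep posts newer than the last processed id that are not obvious spam."""
    return [
        post for post in posts
        if post["id"] > last_seen_id and not is_obvious_spam(post["text"])
    ]


feed = [
    {"id": 1, "text": "already processed last hour"},
    {"id": 5, "text": "Need a buyer, +$"},
    {"id": 6, "text": "FB account for sale, cheap"},
]
print(new_posts(feed, last_seen_id=1))
```

If a channel has nothing above last_seen_id, the function returns an empty list and the channel is effectively skipped for that run.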

The parser does not publish, does not reply in private chats, does not like posts. It only reads public channels — exactly what any subscriber does when they open Telegram in the morning.

🔄 Deduplication — why one job does not show up 14 times

CPA agencies love to fan a job out: first to their friends in @hrcpa, then to @aff_job, then to @works_cpa, then to three more HR chats just in case. The same text floats up 5 to 15 times. On the site that would look like trash.

We compute a canonical hash from normalized content: strip emoji, lowercase everything, drop whitespace and punctuation, and keep only the meaningful words in the title, company name and description. If the hash matches an existing record — it is a duplicate. We bump the existing record's source_count and add the new source to its «also published in» list.
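A hedged sketch of such a hash follows. SHA-256, the helper names and the rule that drops whole lines starting with «UPD» are assumptions for illustration. The sketch also collapses whitespace instead of removing it, and leaves out any stop-word filtering for «meaningful words».

```python
import hashlib
import re

# Assumption: update notes sit on their own line and start with "UPD".
UPD_LINE = re.compile(r"\s*upd\b", re.IGNORECASE)


def normalize(text):
    """Lowercase, drop UPD lines, emoji and punctuation, collapse whitespace."""
    lines = [line for line in text.splitlines() if not UPD_LINE.match(line)]
    lowered = " ".join(lines).lower()
    # Anything that is not a letter or digit (emoji, punctuation, spaces)
    # becomes a space; split/join then collapses the runs.
    kept = "".join(ch if ch.isalnum() else " " for ch in lowered)
    return " ".join(kept.split())


def canonical_hash(title, company, description):
    """Hash the normalized title, company name and description together."""
    payload = "|".join(normalize(part) for part in (title, company, description))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = canonical_hash("🔥 Looking for a Media Buyer", "ExampleCo",
                   "FB Gambling, budget from $50K, remote")
b = canonical_hash("💰 Looking for a Media Buyer", "ExampleCo",
                   "FB Gambling, budget from $50K, remote\nUPD: urgent, need someone today")
print(a == b)
```

The two calls differ only in the emoji and the appended UPD line, so they normalize to the same text and produce the same hash.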

A real example from last week. On April 14, @aff_job posted «Looking for a Media Buyer, FB Gambling, budget from $50K, remote». On April 15, @works_cpa published the very same text letter for letter, with a different emoji in the title. On April 16, @hrcpa added «UPD: urgent, need someone today». The hash function strips the emoji and the UPD note, sees the identical remainder, and merges the three posts into one record with source_count=3.
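The merge step for this example could look roughly like the sketch below. The in-memory dictionary and the primary_source and also_published_in field names are hypothetical. source_count is the field named above, and the sketch assumes it counts distinct channels.

```python
def record_appearance(store, job_hash, channel):
    """Create a record on first sight, otherwise merge the new source into it."""
    record = store.get(job_hash)
    if record is None:
        store[job_hash] = {
            "primary_source": channel,   # shown on the job card
            "also_published_in": [],     # every later channel
            "source_count": 1,
        }
        return store[job_hash]
    already_known = (channel == record["primary_source"]
                     or channel in record["also_published_in"])
    if not already_known:
        record["also_published_in"].append(channel)
        record["source_count"] += 1
    return record


store = {}
for channel in ["@aff_job", "@works_cpa", "@hrcpa"]:
    # All three posts normalize to the same text, hence the same hash.
    record_appearance(store, "same-canonical-hash", channel)

print(store["same-canonical-hash"]["source_count"])
```

A repeat post from a channel that is already listed leaves the counter unchanged in this sketch. Counting every appearance instead would be an equally plausible design.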

Right now the database has a single job with source_count=24 (the same Media Buyer position surfaced in 24 channels in a month), and several with source_count=15. Without dedup the feed would be roughly 2–3 times longer and 60% of it would be repeats.

🏷 Classification — what we pull out of the text

A raw post can be anywhere from 200 characters («Need a buyer, +$») to 5,000 («Full role description, KPIs, offer, quarterly performance bonuses, our company story over two screens»). For the filter to work, we extract a clean structure:

Vertical — iGaming, Crypto, Nutra, Finance, E-commerce, Agency. Stored as an array in vertical_tags: a single job can cover iGaming + Crypto at the same time.
Role — Affiliate Manager, Media Buyer, BizDev, Developer, Designer, HR Recruiter, Sales. Stored in role_tags.
Seniority — junior / middle / senior / lead. One level per job.
Format — remote / office / hybrid.
Location — we normalize «Cyprus, Limassol» and «Limassol Cyprus» into a single record.
Salary — we try to extract the range, currency and period (hour / day / month / year). If the text says «$3000–5000 net» we record 3000–5000 USD/month.
Skills — the list of tools, stored in external_data->skills (FB Ads, Keitaro, Bemob, Voluum, Binom, etc).
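To make the target structure concrete, here is one possible shape for the extracted record, as a sketch. Only vertical_tags, role_tags and the skills key inside external_data are names used above; every other field name is hypothetical. The location helper is a deliberately naive illustration of order-insensitive matching, not the normalizer we run.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    title: str
    company: str
    vertical_tags: List[str] = field(default_factory=list)  # e.g. ["iGaming", "Crypto"]
    role_tags: List[str] = field(default_factory=list)      # e.g. ["Media Buyer"]
    seniority: Optional[str] = None        # junior / middle / senior / lead
    work_format: Optional[str] = None      # remote / office / hybrid
    location_key: Optional[str] = None
    salary_min: Optional[int] = None
    salary_max: Optional[int] = None
    salary_currency: Optional[str] = None
    salary_period: Optional[str] = None    # hour / day / month / year
    external_data: dict = field(default_factory=dict)       # {"skills": [...]}


def location_key(raw):
    """Naive key: lowercase the words and sort them, so word order is ignored."""
    words = re.findall(r"[^\W\d_]+", raw.lower())
    return " ".join(sorted(words))


job = Job(
    title="Media Buyer",
    company="ExampleCo",
    vertical_tags=["iGaming", "Crypto"],
    role_tags=["Media Buyer"],
    location_key=location_key("Cyprus, Limassol"),
    external_data={"skills": ["FB Ads", "Keitaro"]},
)
print(job.location_key == location_key("Limassol Cyprus"))
```

Sorting words is too crude for real place names (it cannot tell two different cities that share the same words apart), so a production normalizer would more likely map strings onto a gazetteer.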

All of this is extracted via two paths. The first is a regex parser: cheap, fast, rarely wrong on standard patterns («Salary from $3000» → 3000 USD). The second is a fallback to an LLM, used when regex returns low confidence or finds nothing at all (dense unstructured prose).
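The two-path idea can be sketched for the salary field as follows. The patterns, the confidence numbers, the 0.7 cut-off and the llm_extract callable are all assumptions for illustration. Defaulting to USD per month when the text gives no period mirrors the «$3000–5000 net» example above. Amounts written with a «K» suffix are deliberately left unmatched here, so they fall through to the fallback.

```python
import re

# The negative lookahead refuses amounts followed by more digits or a "k" suffix.
_END = r"(?![\d,]|\s*k\b)"
SALARY_RANGE = re.compile(r"\$\s*(\d[\d,]*)\s*[–-]\s*\$?\s*(\d[\d,]*)" + _END, re.IGNORECASE)
SALARY_FROM = re.compile(r"from\s+\$\s*(\d[\d,]*)" + _END, re.IGNORECASE)


def _to_int(digits):
    return int(digits.replace(",", ""))


def extract_salary(text):
    """Regex path: returns a dict with a made-up confidence, or None."""
    match = SALARY_RANGE.search(text)
    if match:
        return {"min": _to_int(match.group(1)), "max": _to_int(match.group(2)),
                "currency": "USD", "period": "month", "confidence": 0.9}
    match = SALARY_FROM.search(text)
    if match:
        return {"min": _to_int(match.group(1)), "max": None,
                "currency": "USD", "period": "month", "confidence": 0.8}
    return None


def extract_with_fallback(text, llm_extract, min_confidence=0.7):
    """Use the regex result when it is confident enough, otherwise ask the LLM path."""
    result = extract_salary(text)
    if result is None or result["confidence"] < min_confidence:
        return llm_extract(text)
    return result


print(extract_salary("$3000–5000 net"))
print(extract_salary("Salary from $3000"))
```

Passing the LLM call in as a plain callable keeps the regex path testable on its own and makes it easy to stub the expensive path in tests.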

After extraction each job gets a quality_score between 0 and 100. The more fields are filled in (salary, location, contacts, description over 200 characters, normalized company name), the higher the score. On the homepage we surface those with quality_score ≥ 70.
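As an illustration of how a completeness score like this can work, the sketch below gives the five listed fields equal weight. The equal weighting and the dictionary keys are assumptions. Only the 0 to 100 range, the field list and the homepage threshold of 70 come from the text.

```python
HOMEPAGE_THRESHOLD = 70


def quality_score(job):
    """Score 0..100: each filled field contributes an equal share (assumed weights)."""
    checks = [
        job.get("salary_min") is not None,
        bool(job.get("location")),
        bool(job.get("contacts")),
        len(job.get("description", "")) > 200,
        bool(job.get("company_normalized")),
    ]
    return 100 * sum(checks) // len(checks)


full = {
    "salary_min": 3000,
    "location": "cyprus limassol",
    "contacts": "@recruiter_handle",
    "description": "x" * 250,
    "company_normalized": "exampleco",
}
no_salary = {**full, "salary_min": None}
sparse = {**no_salary, "contacts": ""}

for job in (full, no_salary, sparse):
    score = quality_score(job)
    print(score, score >= HOMEPAGE_THRESHOLD)
```

With equal weights, a job missing one field still clears the homepage threshold, while a job missing two does not.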

🤖 AI moderation — where regex stops and Claude takes over

Regex handles a clean format well. It does not handle masked spam, hidden course promos or résumé posts from job seekers. So every post the parser is uncertain about goes into a queue for the LLM (we use Claude Sonnet from Anthropic).

The prompt is strict: the model must return JSON with one of three decisions — publish, reject, review. Plus a short justification (for logs and manual audit). No creative free-form replies.
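On the receiving side, a strict contract like this is usually enforced by validating the reply before acting on it. The sketch below assumes the JSON keys are called decision and reason, and assumes that any malformed reply is downgraded to review. Neither detail is confirmed above.

```python
import json

ALLOWED_DECISIONS = {"publish", "reject", "review"}


def parse_decision(raw):
    """Validate the model reply; anything malformed falls back to manual review."""
    fallback = {"decision": "review", "reason": "unparseable model output"}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    decision = data.get("decision")
    reason = data.get("reason")
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        return fallback
    if not isinstance(reason, str) or not reason.strip():
        return fallback
    return {"decision": decision, "reason": reason.strip()}


print(parse_decision('{"decision": "reject", "reason": "disguised course ad"}'))
print(parse_decision("Sure! Here is my analysis of the post..."))
```

Falling back to review instead of raising keeps a single bad reply from blocking the queue, at the cost of a little extra manual work.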

A case from March. Channel @aff_job on March 12 posted: «Looking for a team lead on a casino traffic team, salary $25K, remote, no experience required». The parser flagged low confidence — no company, no specifics, no reasonable experience requirements. Claude Sonnet returned reject with the rationale: «the phrasing 'no experience required for $25K with no duties listed' is a typical pattern for harvesting contacts or laundering a candidate database; the rate does not match the market for a junior role». The decision was applied automatically and the job never entered the public feed.

In March LLM moderation rejected roughly 7% of the posts that had already passed the regex filter. Among them: disguised course ads («media buying course for $99, we will place you in a job after graduation»), résumé-harvesting attempts («send your CV to us in DM, we will forward it to the employer»), and obvious pyramid schemes.

Decisions with ≥75% confidence (publish) or ≥80% confidence (reject) are applied automatically. Anything below that goes to manual moderation. The manual queue is reviewed once a day, usually with 5–15 cases waiting.
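The routing rule above fits in a few lines. In this sketch confidence is assumed to be a fraction between 0 and 1, and the returned labels are hypothetical names for the three outcomes.

```python
PUBLISH_THRESHOLD = 0.75
REJECT_THRESHOLD = 0.80


def route(decision, confidence):
    """Apply confident publish/reject decisions automatically, queue the rest."""
    if decision == "publish" and confidence >= PUBLISH_THRESHOLD:
        return "auto_publish"
    if decision == "reject" and confidence >= REJECT_THRESHOLD:
        return "auto_reject"
    return "manual_review"


for decision, confidence in [("publish", 0.75), ("reject", 0.79), ("review", 0.99)]:
    print(decision, confidence, route(decision, confidence))
```

A review decision always lands in the manual queue, whatever its confidence. Because the reject bar is higher than the publish bar, a reject at 0.79 is queued while a publish at the same confidence is applied automatically.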

⏰ Life cycle — why jobs disappear after 30 days

Every job lives 30 days from its last appearance in the source. If it stops appearing — we close it automatically and switch it to status=expired. The old page is still reachable by direct link, but gets noindex,follow and drops out of search.

This window is not pulled out of thin air. We looked at the typical lifespan of jobs in iGaming/CPA: companies keep an active hiring round open for 21–28 days on average, then either fill the role or revise the requirements. 30 days is a comfortable buffer that avoids closing a live position too early.

A real example. The job «Media Buyer FB Gambling CTEAM» was published on February 1. Over the next 30 days it was republished across our sources 19 more times, meaning the company is still hiring. We keep it active, with source_count=19. If 30 days had passed without a single source mentioning it again, it would have moved itself to the archive.
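The 30-day rule described above can be sketched as a pure function of the last appearance time. The function names and the index,follow value for live pages are assumptions. So is the boundary choice that a job expires only once more than 30 days have passed.

```python
from datetime import datetime, timedelta, timezone

TTL = timedelta(days=30)


def status_for(last_seen_at, now):
    """A job stays active until more than 30 days pass since its last appearance."""
    return "expired" if now - last_seen_at > TTL else "active"


def robots_meta(status):
    """Expired pages stay reachable by direct link but are kept out of search indexes."""
    return "noindex,follow" if status == "expired" else "index,follow"


now = datetime(2025, 3, 10, tzinfo=timezone.utc)
recently_reposted = now - timedelta(days=3)
gone_quiet = now - timedelta(days=31)

print(status_for(recently_reposted, now), robots_meta(status_for(recently_reposted, now)))
print(status_for(gone_quiet, now), robots_meta(status_for(gone_quiet, now)))
```

Because the clock runs from the last appearance and not from the first, every repost in a source channel pushes the expiry date forward.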

Right now the database holds 563 active jobs and roughly 2,800 archived ones across the entire history since late 2024. The archive stays for retrospective use: to see which companies were hiring for which roles and what salaries the industry was paying. But the main feed only shows live positions.

⚖️ What we DO NOT do

Job catalogs have many ways to monetize traffic on top of the core product. We reject all of them:

We do not charge employers for posting or for priority placement. All jobs are equal, ranking is strictly by quality_score and freshness.
We do not take a placement fee. We are not an HR agency — we are a catalog. No «$2K per placement» from the company side.
We do not charge candidates. Browsing, applying, any activity — free. We will never put up a paywall to view contact details.
We do not publish «grey-market» jobs — account farming, multi-accounts, cash-out schemes, money muling — even when such ads are present in the source channel.
We do not show Google AdSense or third-party banner ads inside job cards, and we do not mix in native ads.
We do not run spam campaigns to candidates on Telegram. If we have your contact — it is only for notifications on subscriptions you have explicitly opted into.
We do not pass company contact details to third parties. Whatever is in a public card was already public in the source channel.

This is not a moral stance but a technical choice: adding any one of these would turn the catalog into yet another junk site full of inflated salaries and fake companies within a month. There are plenty of examples of how that goes.

👥 Who runs this

Affiliate.Careers is part of an independent project built by an affiliate-marketing community. No outside investors, no advertising contracts with specific CPA networks.

The team works by role: platform engineering (Laravel + Postgres + Telethon parser), moderation of edge cases, maintaining the list of source channels, user support. Everything is small and non-corporate by design. The smaller the team, the lower the temptation to sell «priority placements» or accept «partnership offers» from dubious recruiters.

Contact for corrections, job removal, spam complaints — at the bottom of the page.

🛠 If an employer asks to remove a job

It happens: a company has closed the role, does not want to be visible, changed its mind. Our procedure is:

We verify the request comes from someone connected to the company. The simplest way — a message from the same Telegram channel where we picked up the job (if @hrcpa asks for removal — they are the ones who published it). An alternative — an email from an address on the company's domain.
If the connection is confirmed — we remove the job manually within 24 hours. Fully removed, not «hidden»: the record is deleted from careers.jobs and the page starts returning 410 Gone.
If the connection is not confirmed — we reply with the specific proof we need. We do not remove jobs based on requests from unrelated accounts — otherwise the scheme degrades into «found a competitor's job — write to have it removed».

A removed job does not come back even if it is republished in the sources. The record is tagged removed_by_request=true, and the parser ignores the canonical_hash on any reappearance.
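A sketch of the check the parser could run before creating a record is shown below. The set of removed hashes, the function names and the 404 for unknown pages are assumptions. The 410 response for removed jobs is the behaviour described in the procedure.

```python
def should_ingest(canonical_hash, removed_hashes):
    """Skip any post whose hash was removed at an employer's request."""
    return canonical_hash not in removed_hashes


def page_status(canonical_hash, known_hashes, removed_hashes):
    """HTTP status for a job page: 410 if removed on request, 200 if known, else 404."""
    if canonical_hash in removed_hashes:
        return 410
    if canonical_hash in known_hashes:
        return 200
    return 404


removed = {"hash-of-removed-job"}
known = {"hash-of-live-job", "hash-of-expired-job"}

print(should_ingest("hash-of-removed-job", removed))
print(page_status("hash-of-removed-job", known, removed))
print(page_status("hash-of-expired-job", known, removed))
```

Checking the removed set before anything else means a republished copy of a removed job is dropped without ever reaching moderation.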

📩 Found a problem?

A garbled description, a typo, a job that has actually closed but is still showing as active, or just plain «this is not a job, this is spam»? Drop us a line on the contact page — we respond within an hour during working hours.