Phonetic Bad Word Filter
FREEMIUM
By mike.nichols
Updated 8 months ago
Phonetic Bad Word Filter Overview
This filter blocks not only the bad words you specify but also words that 'sound' like them. Phonetic filtering allows user-generated content to be used safely on public displays and in text-to-speech (TTS) systems, such as TTS for YouTube and Twitch streaming.

Phonetic Filter Documentation

This is not your standard bad word filter. It blocks words that 'sound' like your bad words, which stops attempts to bypass the filter with alternate spellings.

This filter is useful when you have a list of user-generated messages, want to pick one for text to speech or display, and need to be sure it contains no hate speech.

I originally created this filter to allow me to have unattended Text to Speech on Twitch. My TTS system has been very successful at blocking all attempts to say the 'n' word.

This API will allow other developers to include safe content filtering in their projects as well.

This API can be configured to block any words required.

Phrase:
Required
1-1500 characters

Phonetic List:
Optional
0-500 items
String array of words you want to block if an input word sounds like it.

  • Place only the highly sensitive words you want to block here. Adding too many could cause false positives.
  • The filter is tuned to block the n-word; some n-word filtering is enabled by default.

Whitelist:
Optional
0-5000 items
Words you have blocked with your phonetic list that you would like to unblock

Blacklist:
Optional
0-5000 items
Words you would like to block based on an exact match

  • Regex enabled, JS flavor
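
The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `build_request` is hypothetical, and the JSON key names are taken from the sample request in this documentation.

```python
import json

# Limits taken from the parameter descriptions above.
MAX_PHRASE_CHARS = 1500
MAX_PHONETIC_ITEMS = 500
MAX_LIST_ITEMS = 5000


def build_request(phrase, phoneticlist=None, whitelist=None, blacklist=None):
    """Return a request body dict; raise ValueError if a documented limit is exceeded.

    Hypothetical helper for illustration only.
    """
    phoneticlist = list(phoneticlist or [])
    whitelist = list(whitelist or [])
    blacklist = list(blacklist or [])

    if not 1 <= len(phrase) <= MAX_PHRASE_CHARS:
        raise ValueError("phrase must be 1-1500 characters")
    if len(phoneticlist) > MAX_PHONETIC_ITEMS:
        raise ValueError("phoneticlist allows at most 500 items")
    if len(whitelist) > MAX_LIST_ITEMS:
        raise ValueError("whitelist allows at most 5000 items")
    if len(blacklist) > MAX_LIST_ITEMS:
        raise ValueError("blacklist allows at most 5000 items")

    return {
        "phrase": phrase,
        "phoneticlist": phoneticlist,
        "whitelist": whitelist,
        "blacklist": blacklist,
    }


body = build_request("Look I got a new kar!", phoneticlist=["car", "heck"])
print(json.dumps(body, indent=4))
```

Failing early on the client avoids spending a call on a request that would only come back with one of the "too many items" or length codes.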

----

The filter is intended for blocking racism and hate speech, but the sample below is more polite :)

Sample request:

{
    "phrase":"Look I got a new kar!" ,
    "phoneticlist":["car","heck"],
    "whitelist":["youtube","twitch"],
    "blacklist":["ni.*rs","hell"]
}

Sample Response:

{
    "code": 5,
    "summary": "Exact phonetic code match found kar matched with car",
    "source-phrase": "Look I got a new kar!",
    "matched-word": "kar",
    "matched-phonetic-word": "car",
    "censoredPhrase": "Look I got a new ****!"
}
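
As a rough illustration of consuming a response like the one above, the sketch below picks the text that is safe to show or speak. The function name `text_for_display` is hypothetical, and it is an assumption that fields other than `code` may be missing for some codes, which is why `.get` is used.

```python
import json

# The sample response from this documentation, as a JSON string.
sample_response = """
{
    "code": 5,
    "summary": "Exact phonetic code match found kar matched with car",
    "source-phrase": "Look I got a new kar!",
    "matched-word": "kar",
    "matched-phonetic-word": "car",
    "censoredPhrase": "Look I got a new ****!"
}
"""


def text_for_display(response_json):
    """Return the original phrase when code is 1, otherwise the censored phrase.

    Returns None if the expected field is absent from the response.
    """
    data = json.loads(response_json)
    if data["code"] == 1:
        return data.get("source-phrase")
    return data.get("censoredPhrase")


print(text_for_display(sample_response))  # prints: Look I got a new ****!
```

Note that the field names mix hyphenated keys (`source-phrase`) with camelCase (`censoredPhrase`), so they need to be read by exact key rather than as attributes.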

Response codes:

1 - Phrase is good, no bad words found
2 - Single word phonetic match
3 - Word combo phonetic match
4 - Blacklist match
5 - Phonetic exact match
6 - Phrase over 1500 characters
7 - Unrecognized characters
8 - Leet speak found (numbers in words)
9 - Blacklist has too many items
10 - Whitelist too many items
11 - Phonetic list too many items
12 - Phrase empty
13 - Russian TTS characters found
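
For client code, the codes listed above could be wrapped in an enum. This is only a sketch: the member names and the three-way grouping ("ok", "blocked", "invalid_request") are my own reading of the list, not something the API defines.

```python
from enum import IntEnum


class FilterCode(IntEnum):
    """Response codes from the list above; member names are hypothetical."""
    PHRASE_OK = 1
    PHONETIC_SINGLE_WORD = 2
    PHONETIC_WORD_COMBO = 3
    BLACKLIST_MATCH = 4
    PHONETIC_EXACT = 5
    PHRASE_TOO_LONG = 6
    UNRECOGNIZED_CHARACTERS = 7
    LEET_SPEAK = 8
    BLACKLIST_TOO_LARGE = 9
    WHITELIST_TOO_LARGE = 10
    PHONETIC_LIST_TOO_LARGE = 11
    PHRASE_EMPTY = 12
    RUSSIAN_TTS_CHARACTERS = 13


# Assumed grouping: codes that describe a problem with the request itself
# rather than with the content of the phrase.
REQUEST_PROBLEMS = {
    FilterCode.PHRASE_TOO_LONG,
    FilterCode.BLACKLIST_TOO_LARGE,
    FilterCode.WHITELIST_TOO_LARGE,
    FilterCode.PHONETIC_LIST_TOO_LARGE,
    FilterCode.PHRASE_EMPTY,
}


def classify(code):
    """Map a numeric response code to 'ok', 'blocked' or 'invalid_request'."""
    code = FilterCode(code)  # raises ValueError for codes not in the list
    if code is FilterCode.PHRASE_OK:
        return "ok"
    if code in REQUEST_PROBLEMS:
        return "invalid_request"
    return "blocked"


print(classify(5))  # prints: blocked
```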

1 - Phrase is good, no bad words found
The phrase is safe based on the filter settings provided in the request. Some n-word checking is set as a default.

2 - Single word phonetic match
One of the words in the phrase matched phonetically with a word in your phoneticlist array.

3 - Word combo phonetic match
Two adjacent words from the input phrase, joined together, matched phonetically with one of the words in your phoneticlist array. For example, if you want to block "Nagger" and the user enters "Nag gur", it will be blocked. This is a typical attempt users make to bypass a filter.
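
To show the idea behind a word combo match, here is a toy sketch using a polite example. The phonetic key below is a simplified stand-in (consonant classes borrowed from Soundex, vowels dropped); the algorithm the API actually uses is not documented here, and `toy_key` and `find_combo_match` are hypothetical names.

```python
import re

# Consonant sound classes borrowed from Soundex; vowels, h, w and y are dropped.
_SOUND_CLASSES = {
    "b": "1", "f": "1", "p": "1", "v": "1",
    "c": "2", "g": "2", "j": "2", "k": "2",
    "q": "2", "s": "2", "x": "2", "z": "2",
    "d": "3", "t": "3",
    "l": "4",
    "m": "5", "n": "5",
    "r": "6",
}


def toy_key(word):
    """Build a crude phonetic key: one digit per consonant class, repeats collapsed."""
    key = []
    for ch in word.lower():
        digit = _SOUND_CLASSES.get(ch)
        if digit and (not key or key[-1] != digit):
            key.append(digit)
    return "".join(key)


def find_combo_match(phrase, phoneticlist):
    """Join each pair of adjacent words and compare its key with the blocked words.

    Returns (matched pair, blocked word) or None.
    """
    targets = {toy_key(word): word for word in phoneticlist}
    words = re.findall(r"[a-z]+", phrase.lower())
    for first, second in zip(words, words[1:]):
        blocked = targets.get(toy_key(first + second))
        if blocked:
            return f"{first} {second}", blocked
    return None


print(find_combo_match("I bought a kar pit today", ["carpet"]))
```

Here "kar" and "pit" do not match "carpet" on their own, but joined as "karpit" they produce the same toy key, which is the kind of split the combo check is meant to catch.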

4 - Blacklist match
A word in the phrase matches one of your blacklist items. Blacklist items can be plain text or regex (JS flavor).
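
A blacklist check of this kind might look roughly like the sketch below. It assumes each pattern must match a whole word and that matching is case-insensitive; neither is confirmed by this documentation. It also uses Python's `re` module, which agrees with JS regex for simple patterns like these but differs in some advanced syntax. The function name is hypothetical.

```python
import re


def find_blacklist_match(phrase, blacklist):
    """Return (word, pattern) for the first word fully matched by a blacklist pattern.

    Returns None when no word matches.
    """
    words = re.findall(r"\w+", phrase.lower())
    for word in words:
        for pattern in blacklist:
            # fullmatch: the pattern must cover the entire word (an assumption).
            if re.fullmatch(pattern, word, flags=re.IGNORECASE):
                return word, pattern
    return None


blacklist = ["ni.*rs", "hell"]
print(find_blacklist_match("what the hell", blacklist))  # prints: ('hell', 'hell')
print(find_blacklist_match("hello there", blacklist))    # prints: None
```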

5 - Phonetic exact match
A word in the phrase was an exact text match with a word on your phonetic list.

6 - Phrase over 1500 characters

7 - Unrecognized characters
A single-letter word was found that is not a proper English one-letter word. This blocks attempts such as n a g g e r, where a space is added after each letter to defeat a filter.
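
A client-side pre-check for spaced-out letters could be sketched as follows. The set of allowed one-letter words ("a" and "i") is an assumption; the API's own list is not documented here, and the function name is hypothetical.

```python
import string

# Assumption: only "a" and "I" count as proper English one-letter words.
ALLOWED_SINGLE_LETTERS = {"a", "i"}


def has_stray_single_letters(phrase):
    """Return True if the phrase contains a one-letter word outside the allowed set."""
    for token in phrase.lower().split():
        word = token.strip(string.punctuation)  # ignore surrounding punctuation
        if len(word) == 1 and word.isalpha() and word not in ALLOWED_SINGLE_LETTERS:
            return True
    return False


print(has_stray_single_letters("h e c k"))                # prints: True
print(has_stray_single_letters("Look I got a new kar!"))  # prints: False
```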

8 - Leet speak found that triggers the filter
Example: 'wh4t a c00l guy'
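
Words that mix letters and digits can be spotted with a short check like the one below. This is only an illustration with a hypothetical function name: it lists every such word, whereas the API reports this code only when the leet speak triggers the filter.

```python
import re


def find_leet_words(phrase):
    """Return the words that contain both at least one letter and at least one digit."""
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    return [w for w in words if re.search(r"[A-Za-z]", w) and re.search(r"\d", w)]


print(find_leet_words("wh4t a c00l guy"))  # prints: ['wh4t', 'c00l']
print(find_leet_words("I have 2 cars"))    # prints: []
```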

9 - Blacklist has too many items
You can set up to 5000 items in the blacklist array.

10 - Whitelist too many items
You can set up to 5000 items in the whitelist array.

11 - Phonetic list too many items
You can set up to 500 items in the phonetic list array. Adding many items increases the response time, because phonetic checking is the most CPU-expensive part of the call.
I strongly recommend placing only your top-priority words in this list and using the blacklist for lower-priority items.
Good English words bypass phonetic checking by default, but if you add too many items you may start to block names and places.
I recommend just the N word and the F**got word; these are the top-priority words for Twitch and YouTube.

12 - Phrase empty
An input phrase is required

13 - Russian TTS characters found
This is for a specific Russian character exploit for TTS. If you have a TTS engine say ниггер it does not sound good :)
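
If you want to screen for this before calling the API, one simple approach is to look for characters in the Unicode Cyrillic block (U+0400 to U+04FF). This is a sketch of that idea with a hypothetical function name; how the API itself detects these characters is not documented here.

```python
def contains_cyrillic(text):
    """Return True if any character falls in the Unicode Cyrillic block (U+0400-U+04FF)."""
    return any("\u0400" <= ch <= "\u04FF" for ch in text)


print(contains_cyrillic("hello привет"))  # prints: True
print(contains_cyrillic("hello"))         # prints: False
```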
