SOTOY Documentation
Current version 1.0.0

SOTOY is an intelligent spam detection system provided by KMKLabs to filter spam comments out of websites. It runs as a microservice and uses a machine learning implementation built on a complex vector space model. It works fast, with an accuracy of more than 95%.

What makes it stand out is that SOTOY's models are built from a very large, incrementally growing dataset. SOTOY is designed to learn from new data, which our training system consumes daily. The microservice is now widely used by well-known websites such as liputan6.com and vidio.com, and by the BBM (BlackBerry Messenger) application.

a) Detecting Spam:  /api/comment-check

PARAMETER - HEADER

  • 'key:your_application_key'
  • 'Content-Type: application/json'

PARAMETER - DATA

  • comment_content: Your text (the content of the comment)
  • blog: URL of the blog or site, for example https://myblog.me
  • user_ip: IP address of the comment submitter
  • user_agent: User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable
  • referrer: The content of the HTTP_REFERER header should be sent here
  • permalink: The full permanent URL of the entry the comment was submitted to
  • comment_type: A string that describes the type of content being sent
  • comment_author: Name submitted with the comment
  • comment_author_email: Email address submitted with the comment
  • comment_author_url: URL submitted with comment
  • blog_lang: Indicates the language(s) in use on the blog or site, in ISO 639-1 format, comma-separated. A site with articles in English and French might use “en, fr_ca”
  • blog_charset: The character encoding for the form values included in comment_* parameters, such as “UTF-8” or “ISO-8859-1”
  • user_role: The user role of the user who submitted the comment. This is an optional parameter
  • country: Country of comment submitter
  • app_version: App version used by comment submitter
  • platform: Platform used by comment submitter
  • carrier: Carrier (network operator) of comment submitter
  • device_pin: Device PIN or ID of comment submitter
  • os_version: OS version used by comment submitter
  • device_model: Device model used by comment submitter
  • username: Username of comment submitter
  • avatar_url: Avatar URL of comment submitter

Curl example
curl -v --header 'key:application_key' --header 'Content-Type: application/json' --data '{"comment_content": "spam spam spam", "blog": "https://myblog.me", "platform": "web-desktop"}' "$SOTOY_BASE_URL/api/comment-check"
The response body is either true, false, or invalid. For example:
< HTTP/1.1 200 OK
< Content-Length: 4
< Content-Type: text/html; charset=utf-8
< Date: Tue, 08 Mar 2016 03:21:08 GMT
< Server: waitress
<
true
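
The same check can be made from application code. The following is a minimal Python sketch, not an official client. The function names (build_check_request, interpret_result, check_comment) and whatever base URL you pass in are assumptions for illustration; only the endpoint path, the two headers, and the three possible response bodies come from the description above. The documentation does not state which value means spam, so the sketch assumes that true means the comment was classified as spam, as the "spam spam spam" example suggests.

```python
import json
import urllib.request


def build_check_request(base_url, app_key, comment):
    """Build (but do not send) a POST request for /api/comment-check.

    `comment` is a dict using the data parameters listed above,
    e.g. {"comment_content": "...", "blog": "...", "platform": "..."}.
    """
    body = json.dumps(comment).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/comment-check",
        data=body,
        headers={"key": app_key, "Content-Type": "application/json"},
        method="POST",
    )


def interpret_result(text):
    """Map the response body to True, False, or None (for "invalid").

    Assumption: "true" means the comment was classified as spam.
    """
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    if value == "invalid":
        return None
    raise ValueError(f"unexpected response body: {text!r}")


def check_comment(base_url, app_key, comment, timeout=10):
    """Send the request and interpret the plain-text response body."""
    request = build_check_request(base_url, app_key, comment)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return interpret_result(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the sketch easy to test without network access; check_comment is the only function that actually contacts the service.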

b) Reporting Single Spam:  /api/submit-spam

PARAMETER - HEADER

  • 'key:your_application_key'
  • 'Content-Type: application/json'

PARAMETER - DATA

  • comment_content: Your text
  • blog: URL of the blog or site, for example https://myblog.me
  • user_ip: IP address of the comment submitter
  • user_agent: User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable
  • referrer: The content of the HTTP_REFERER header should be sent here
  • permalink: The full permanent URL of the entry the comment was submitted to
  • comment_type: A string that describes the type of content being sent
  • comment_author: Name submitted with the comment
  • comment_author_email: Email address submitted with the comment
  • comment_author_url: URL submitted with comment
  • blog_lang: Indicates the language(s) in use on the blog or site, in ISO 639-1 format, comma-separated. A site with articles in English and French might use “en, fr_ca”
  • blog_charset: The character encoding for the form values included in comment_* parameters, such as “UTF-8” or “ISO-8859-1”
  • user_role: The user role of the user who submitted the comment. This is an optional parameter
  • country: Country of comment submitter
  • app_version: App version used by comment submitter
  • platform: Platform used by comment submitter
  • carrier: Carrier (network operator) of comment submitter
  • device_pin: Device PIN or ID of comment submitter
  • os_version: OS version used by comment submitter
  • device_model: Device model used by comment submitter
  • username: Username of comment submitter
  • avatar_url: Avatar URL of comment submitter

Curl example
curl -v --header 'key:application_key' --header 'Content-Type: application/json' --data '{"comment_content": "spam spam spam", "blog": "https://myblog.me", "platform": "web-desktop"}' "$SOTOY_BASE_URL/api/submit-spam"
And the result would be
< HTTP/1.1 200 OK
< Content-Length: 41
< Content-Type: text/html; charset=utf-8
< Date: Tue, 08 Mar 2016 03:24:39 GMT
< Server: waitress
<
Thanks for making the web a better place.

c) Reporting Single Ham:  /api/submit-ham

PARAMETER - HEADER

  • 'key:your_application_key'
  • 'Content-Type: application/json'

PARAMETER - DATA

  • comment_content: Your text
  • blog: URL of the blog or site, for example https://myblog.me
  • user_ip: IP address of the comment submitter
  • user_agent: User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable
  • referrer: The content of the HTTP_REFERER header should be sent here
  • permalink: The full permanent URL of the entry the comment was submitted to
  • comment_type: A string that describes the type of content being sent
  • comment_author: Name submitted with the comment
  • comment_author_email: Email address submitted with the comment
  • comment_author_url: URL submitted with comment
  • blog_lang: Indicates the language(s) in use on the blog or site, in ISO 639-1 format, comma-separated. A site with articles in English and French might use “en, fr_ca”
  • blog_charset: The character encoding for the form values included in comment_* parameters, such as “UTF-8” or “ISO-8859-1”
  • user_role: The user role of the user who submitted the comment. This is an optional parameter
  • country: Country of comment submitter
  • app_version: App version used by comment submitter
  • platform: Platform used by comment submitter
  • carrier: Carrier (network operator) of comment submitter
  • device_pin: Device PIN or ID of comment submitter
  • os_version: OS version used by comment submitter
  • device_model: Device model used by comment submitter
  • username: Username of comment submitter
  • avatar_url: Avatar URL of comment submitter

Curl example
curl -v --header 'key:application_key' --header 'Content-Type: application/json' --data '{"comment_content": "ham ham ham", "blog": "https://myblog.me", "platform": "web-desktop"}' "$SOTOY_BASE_URL/api/submit-ham"
And the result would be
< HTTP/1.1 200 OK
< Content-Length: 41
< Content-Type: text/html; charset=utf-8
< Date: Tue, 08 Mar 2016 03:24:39 GMT
< Server: waitress
<
Thanks for making the web a better place.
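
Reporting a single spam or ham comment uses the same headers and data parameters and differs only in the endpoint path. The following is a minimal Python sketch under that reading, not an official client. The names REPORT_PATHS, build_report_request, and send are hypothetical, and the base URL is whatever your deployment uses.

```python
import json
import urllib.request

# Endpoint paths as documented for single-comment reports.
REPORT_PATHS = {"spam": "/api/submit-spam", "ham": "/api/submit-ham"}


def build_report_request(base_url, app_key, comment, kind):
    """Build (but do not send) a report request.

    `kind` selects the endpoint and must be "spam" or "ham".
    `comment` is a dict of the data parameters listed above.
    """
    if kind not in REPORT_PATHS:
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    return urllib.request.Request(
        url=base_url.rstrip("/") + REPORT_PATHS[kind],
        data=json.dumps(comment).encode("utf-8"),
        headers={"key": app_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return the plain-text response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A caller would build a request with build_report_request and pass it to send; per the transcripts above, a successful report returns the thank-you message with status 200.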

d) Bulk Reporting:  /api/submit-spams-hams

PARAMETER - HEADER

  • 'key:your_application_key'
  • 'Content-Type: application/json'

USAGE

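  • '[{"comment_content": "your texts", "flag": "spam|ham"} ...]'
  • The request body is a JSON array of comment objects; each object carries a flag field set to either "spam" or "ham"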

PARAMETER - DATA

  • comment_content: Your text
  • blog: URL of the blog or site, for example https://myblog.me
  • user_ip: IP address of the comment submitter
  • user_agent: User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable
  • referrer: The content of the HTTP_REFERER header should be sent here
  • permalink: The full permanent URL of the entry the comment was submitted to
  • comment_type: A string that describes the type of content being sent
  • comment_author: Name submitted with the comment
  • comment_author_email: Email address submitted with the comment
  • comment_author_url: URL submitted with comment
  • blog_lang: Indicates the language(s) in use on the blog or site, in ISO 639-1 format, comma-separated. A site with articles in English and French might use “en, fr_ca”
  • blog_charset: The character encoding for the form values included in comment_* parameters, such as “UTF-8” or “ISO-8859-1”
  • user_role: The user role of the user who submitted the comment. This is an optional parameter
  • country: Country of comment submitter
  • app_version: App version used by comment submitter
  • platform: Platform used by comment submitter
  • carrier: Carrier (network operator) of comment submitter
  • device_pin: Device PIN or ID of comment submitter
  • os_version: OS version used by comment submitter
  • device_model: Device model used by comment submitter
  • username: Username of comment submitter
  • avatar_url: Avatar URL of comment submitter

Curl example #1: one valid sample
curl -v --header 'key:application_key'  --header 'Content-Type: application/json' --data '[{"comment_content": "ini spam", "blog": "https://myblog.me", "platform": "web-desktop", "flag": "spam"}]' "$SOTOY_BASE_URL/api/submit-spams-hams"
And the result would be
< HTTP/1.1 200 OK
< Content-Length: 40
< Content-Type: text/html; charset=utf-8
< Date: Tue, 06 Sep 2016 05:02:03 GMT
* Server waitress is not blacklisted
< Server: waitress
<
Success [true]
Curl example #2: Two samples with one invalid flag
curl -v --header 'key:application_key'  --header 'Content-Type: application/json' --data '[{"comment_content": "ini spam", "blog": "https://myblog.me", "platform": "web-desktop", "flag": "spamx"},{"comment_content": "ini spam", "flag": "spam"}]' "$SOTOY_BASE_URL/api/submit-spams-hams"
And the result would be
< HTTP/1.1 200 OK
< Content-Length: 37
< Content-Type: text/html; charset=utf-8
< Date: Tue, 06 Sep 2016 05:02:41 GMT
* Server waitress is not blacklisted
< Server: waitress
<
Success [false, true]
Curl example #3: Two valid samples
curl -v --header 'key:application_key'  --header 'Content-Type: application/json' --data '[{"comment_content": "ini ham", "blog": "https://myblog.me", "platform": "web-desktop", "flag": "ham"},{"comment_content": "ini spam", "flag": "spam"}]' "$SOTOY_BASE_URL/api/submit-spams-hams"
And the result would be
< HTTP/1.1 200 OK
< Content-Length: 40
< Content-Type: text/html; charset=utf-8
< Date: Tue, 06 Sep 2016 05:03:51 GMT
* Server waitress is not blacklisted
< Server: waitress
<
Success [true,true]
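
In the three examples above, the response lists one boolean per submitted item, in order, with false for the item whose flag was invalid. The following Python sketch builds a bulk payload and reads such a response. It is an illustration, not an official client: the names VALID_FLAGS, build_bulk_payload, and parse_bulk_result are hypothetical, and the "Success [...]" format is inferred only from these three transcripts, so the actual service may return other shapes.

```python
import json
import re

# Flag values accepted by the bulk endpoint, per the usage note.
VALID_FLAGS = {"spam", "ham"}


def build_bulk_payload(items):
    """Return the JSON body for /api/submit-spams-hams.

    `items` is a list of (comment_dict, flag) pairs. Flags are checked
    locally so that a typo such as "spamx" fails before any request is
    made (a design choice of this sketch, not a service requirement).
    """
    entries = []
    for comment, flag in items:
        if flag not in VALID_FLAGS:
            raise ValueError(f"flag must be 'spam' or 'ham', got {flag!r}")
        entry = dict(comment)  # copy so the caller's dict is not modified
        entry["flag"] = flag
        entries.append(entry)
    return json.dumps(entries)


def parse_bulk_result(text):
    """Parse a body such as 'Success [false, true]' into [False, True]."""
    match = re.fullmatch(r"\s*Success\s*\[(.*)\]\s*", text)
    if match is None:
        raise ValueError(f"unexpected response body: {text!r}")
    inner = match.group(1).strip()
    if not inner:
        return []
    return [part.strip() == "true" for part in inner.split(",")]
```

Pairing the parsed list with the submitted items by position, for example with zip, shows which entries were rejected.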