To buy, hold, or sell?

Photo by vipul uthaiah on Unsplash

Here we are in 2021, hoping this year will be better than the odd and isolating one that just went by. The year began with a push to get more people vaccinated against Covid-19, and the prospect of returning to a normal routine looks a little more promising. But just before the first month of a hopeful year ended, one particular stock, a dying brand, shook up Wall Street and became a ‘meme stock.’

GameStop (NYSE: GME), a brick-and-mortar gaming retailer, was in the spotlight of the American stock market frenzy. The retail chain has seen its…


Boroughs, streets, and cuisine types to avoid in the city

Photo by Anton on Unsplash

New York City is one of the best dining destinations in the world. In this city, you can easily find thousands upon thousands of restaurants for your foodie adventure. From Michelin-starred restaurants to street food, the concrete jungle has them all.

But beneath the glitz and glamor of the fancy restaurants, do people really know what goes on behind the scenes?

To find out more, I looked into the NYC restaurant inspection results to see if I could find anything interesting. You can find out more about the dataset here on NYC Open Data.

About the dataset

The dataset consists of almost 400k restaurant violations…


A summary after reading many StackOverflow posts, now with code!

Image by author: A retro chart generated in R.

Uncovering the company’s business plan throughout the Covid-19 pandemic with Data Science

Photo by zhang kaiyv on Unsplash

This article is part of an NLP series where I use text mining techniques to analyze earnings calls.

In today’s article, I will analyze Apple Inc.’s earnings calls for Financial Year 2020 using keyword extraction and frequency analysis techniques in R.

Preliminary dataset exploration and cleaning

I used the earnings call transcripts that Apple Inc. released for Quarters 1 to 4 of Financial Year 2020. After obtaining the dataset, I used Microsoft Excel and RPA tools to pre-process it.


Looking at the debate with Data Science

Photo by Charles Deluvio on Unsplash

Thanks to the internet, the whole world now knows about the 2020 Presidential Debate that went out of control. All of the major news stations were reporting on how the participants were interrupting and sniping at one another.

I decided to put together an article that analyzes the words used in the event to see if there are any hidden insights.

This article focuses on finding the most used words, broken down by speaker, and on sentiment analysis of the speeches.


Detecting fraud from the text of Enron’s earnings call

Photo by Joshua Hoehne on Unsplash

Natural Language Processing (NLP) has been gaining traction in recent years, allowing us to understand unstructured text data in a way that was never possible before. One of the promises of NLP is to use relevant techniques to detect fraud in companies and shed light on potential violations at an early stage.

About the dataset

I’ve only managed to find two earnings call transcripts online, and only one of them is readable when converted from PDF to text. You can find the original document here.

The earnings call transcript used in this article is from Enron’s conference call held on November 14, 2001…


This article uses Natural Language Processing techniques to detect fraud in the accounting scandal-plagued Chinese coffee chain.

Photo by Ashkan Forouzani on Unsplash

In early 2020, Luckin Coffee was delisted from the NASDAQ stock exchange after the CEO admitted to inflating accounting figures in the company’s 2019 earnings reports.

Luckin Coffee, once acclaimed as Starbucks’ biggest rival in the Chinese coffee market, was charged with fabricating sales revenues in 2019. Even though the scandal took some time to blow up, it inspired me to start thinking about the possibility of detecting fraud through words.

This article focuses on applying Natural Language Processing techniques to Luckin Coffee’s earnings calls for Quarters 2 and 3 of 2019. …


A step-by-step guide to cleaning and analyzing tweets in R

Photo by Darren Halstead on Unsplash

As we get closer to the next U.S. presidential election, I wanted to know what people think of the nominees. Will the current President continue his stay at the White House, or will we see a new U.S. President with fewer angry Twitter rants?

Getting the dataset

I used the R package rtweet to download tweets with the hashtag #WhenTrumpIsOutOfOffice that were posted in March 2020. In total, I found more than 6,000 tweets with the hashtag.

library(rtweet)

# create token named "twitter_token"
twitter_token <- create_token(
  app = appname,
  consumer_key = consumer_key,
  consumer_secret = consumer_secret,
  access_token = access_token,
  access_secret…


Uncover Netflix’s expansion strategy in India against Disney Plus with popular NLP and text mining techniques.

Photo by Thibault Penin on Unsplash

Earnings calls are conference calls between a company’s chief executives and the public. The discussion sessions give the public a glimpse into what is happening in the company and within the industry.

In these conferences, investors tend to read the chief executives’ language as a signal of the company’s future performance. If you have ever listened to an earnings call, you may have already noticed that executives are cautious with the words they use.

Natural Language Processing (NLP) has been gaining popularity in recent years. The intersection between technology and the linguistic field…


Understanding the billionaire’s words and thoughts through Data Science.

Photo by Sharon McCutcheon on Unsplash

Keyword extraction is one of the most popular text mining techniques in the Natural Language Processing (NLP) field. The idea behind keyword extraction is to automatically capture the important words in a text. The technique is very effective when we want to gain insights from a big chunk of text data quickly.

In this article, I will apply keyword extraction techniques to the shareholder letters penned by Warren Buffett between 1977 and 2019. There are many keyword extraction techniques available, but we will focus on three: frequency analysis, RAKE, and POS-tagging.
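To give a feel for the simplest of the three, frequency analysis, here is a minimal sketch in base R. It is only an illustration under stated assumptions: the sample sentence, the tiny stop-word list, and the helper name keyword_frequency are all made up for this example and are not taken from the letters or from the code used in the article.

```r
# Toy input: a made-up sentence, NOT taken from the actual letters.
text <- paste(
  "Our gain in net worth during the year was strong.",
  "Net worth is what counts, and net worth grew again."
)

# A tiny illustrative stop-word list (an assumption; a real analysis
# would use a much fuller list).
stop_words <- c("the", "in", "is", "and", "was", "our", "what", "during")

# Hypothetical helper: count how often each non-stop-word appears.
keyword_frequency <- function(text, stop_words, top_n = 5) {
  # Lowercase, then replace anything that is not a letter or whitespace.
  cleaned <- gsub("[^a-z[:space:]]", " ", tolower(text))
  # Split on runs of whitespace and drop empty tokens and stop words.
  tokens <- unlist(strsplit(cleaned, "[[:space:]]+"))
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stop_words)]
  # Count each remaining word and sort from most to least frequent.
  counts <- sort(table(tokens), decreasing = TRUE)
  head(counts, top_n)
}

top_words <- keyword_frequency(text, stop_words)
print(top_words)
```

Words that survive the stop-word filter and appear most often are treated as candidate keywords. RAKE and POS-tagging refine this idea by looking at phrases and parts of speech rather than raw counts.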

Warren Buffett…

fylim

Changing the world with data points, one word at a time. #naturalLanguageProcessing #textMining
