Wallstreetcn
2024.02.23 08:40

Can AI replace human jobs? HSBC conducted an experiment.

HSBC pointed out that the key to improving work efficiency lies in combining human analysts with AI tools: the AI can make mistakes in critical areas, so human confirmation is needed at every step.

The ChatGPT craze has carried into 2024, and even those who do not use it are starting to be labeled as behind the times.

On February 20, HSBC released an experimental analysis report titled "Can ChatGPT Replace My Job?" In the report, Mark McDonald, Director of Data Science and Analytics at HSBC, compared the performance of ChatGPT's "Advanced Data Analysis" module with that of human analysts and concluded:

"In this experiment, OpenAI performed very well and is still improving, but it has not yet reached the level of replacing data analysts. The use of OpenAI tends to automate specific tasks rather than completely replace all human responsibilities and work. The productivity level of data analysts has been significantly improved with the use of OpenAI tools."

HSBC stated that for the experiment they used a publicly available dataset, the Zillow Home Value Index (ZHVI) for US states, and had both human data scientists and ChatGPT conduct exploratory data analysis (EDA) on it.

HSBC believes this task is challenging for an AI tool such as ChatGPT for the following reasons:

  • Vague Instructions: The request is deliberately open-ended and does not specify which aspects of the dataset to analyze, so ChatGPT must decide on its own how to conduct the EDA.
  • Multi-step Analysis Required: EDA is not a single simple task; it takes multiple analysis steps to explore the dataset's characteristics and trends.
  • Non-typical Data Format: The dataset is not in a common standard format, which adds complexity to processing and analyzing it.

Looking at the results of the experiment, HSBC wrote that they initially just loaded the dataset into the ChatGPT dialog box and asked it to conduct exploratory data analysis (EDA). These attempts usually ended with ChatGPT crashing after completing only a few EDA steps.

To make the experiment run more smoothly, HSBC found it necessary to first list the EDA steps ChatGPT was expected to complete and then work through them one at a time, with a human confirming each step. This human involvement is what gets the best performance out of ChatGPT, steering it toward completing the tasks correctly and efficiently.

Comparison between ChatGPT and Human Analysts

In the report, HSBC stated that they uploaded the Zillow Home Value Index (ZHVI) data file for US states to ChatGPT and asked it to load the data into a pandas DataFrame. They then had ChatGPT work through a comprehensive exploratory data analysis (EDA) step by step (the full list of steps is in the appendix below). At the same time, HSBC had human analysts perform the corresponding steps to compare the strengths and weaknesses of humans and AI in data analysis:

First, in the data-preparation stage, the human analysts transposed the rows and columns of the DataFrame. With this method, the dates that originally served as column names become the index, and the values of the original RegionName column become the new column names. The drawback is that the other metadata columns (RegionID, SizeRank, RegionType, and StateName) are dropped from the main frame and have to be kept in a separate metadata object.

ChatGPT's approach was to use pandas' melt function to reshape the wide-format DataFrame into long format. The advantage of melt is that all of the metadata stays in the same DataFrame.

In this example the metadata is not particularly useful, so both methods work. In datasets where the other metadata matters more, however, the human analysts' method can require many subsequent join or merge operations later in the analysis, which is more cumbersome.
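For illustration, here is a minimal sketch of the two reshaping approaches; the file name and exact column layout are assumptions based on the ZHVI export described above, not code taken from the HSBC report:

```python
import pandas as pd

# Load the Zillow Home Value Index (ZHVI) state-level file.
# File name and column layout are illustrative placeholders.
df = pd.read_csv("State_zhvi.csv")

meta_cols = ["RegionID", "SizeRank", "RegionName", "RegionType", "StateName"]
date_cols = [c for c in df.columns if c not in meta_cols]

# Human analysts' approach: transpose, so dates become the index and the
# RegionName values become column names; the remaining metadata is split off.
metadata = df[meta_cols]
wide = df.set_index("RegionName")[date_cols].T
wide.index = pd.to_datetime(wide.index)

# ChatGPT's approach: melt the wide frame into long format, keeping all
# metadata columns alongside each observation.
long_df = df.melt(id_vars=meta_cols, var_name="Date", value_name="ZHVI")
long_df["Date"] = pd.to_datetime(long_df["Date"])
```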

ChatGPT also includes plenty of comments in the code it writes, which helps readers understand the purpose and function of each step. Humans, by contrast, are often reluctant to spend time writing comments during data analysis because it slows them down.

Nevertheless, the many comments in ChatGPT-generated code are good for code quality and for collaboration across teams: although humans dislike writing comments, they appreciate finding them when reading other people's code.

In the report, HSBC pointed out that the most impressive thing ChatGPT did was to visualize the growth rate of house prices across states on a map. The screenshot below shows only a static view, but the actual output is an interactive HTML/JavaScript map:

This is also a good example of how ChatGPT and human analysts can collaborate effectively. In the example shown below, ChatGPT used a Python package called folium to create the map, a tool the human analysts had not used before. By examining the code ChatGPT generated, together with a complete working example, the human analysts could quickly learn how to produce similar visualizations.

At the same time, the interactive visualization ChatGPT produced had a flaw: missing data was color-coded the same way as low growth rates, which is confusing. The human analysts fixed this by modifying ChatGPT's code. In the improved visualization shown below, states with missing data are highlighted in blue, making the information clearer and easier to read.
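Below is a rough sketch of how such a choropleth can be built with folium, including a distinct color for missing data. The growth-rate calculation, GeoJSON URL, and styling are illustrative assumptions rather than the report's actual code; long_df comes from the reshaping sketch above:

```python
import folium

# A simple cumulative growth figure per state from the long-format data
# (illustrative; the report's exact growth definition is not given in the article).
growth = (
    long_df.sort_values("Date")
    .groupby("RegionName")["ZHVI"]
    .agg(lambda s: s.iloc[-1] / s.iloc[0] - 1.0)
    .reset_index(name="Growth")
)

# Public GeoJSON of US state boundaries (placeholder URL; any equivalent file works).
us_states_geo = "https://raw.githubusercontent.com/python-visualization/folium/main/examples/data/us-states.json"

m = folium.Map(location=[39.8, -98.6], zoom_start=4)
folium.Choropleth(
    geo_data=us_states_geo,
    data=growth,
    columns=["RegionName", "Growth"],
    key_on="feature.properties.name",  # join on full state names
    fill_color="YlOrRd",
    nan_fill_color="blue",  # states with missing data shown in blue, as in the corrected map
    legend_name="Cumulative house price growth",
).add_to(m)
m.save("zhvi_growth_map.html")  # interactive HTML/JavaScript output
```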

HSBC pointed out that when ChatGPT ran the correlation analysis, it made a mistake typical of an untrained data analyst: it calculated correlations on non-stationary data, analyzing price levels rather than percentage changes in prices:

Although ChatGPT is familiar with the econometrics literature and can suggest applying ARIMA models to the data, it still computed correlations on non-stationary data. This highlights a difference between ChatGPT and humans in how knowledge is applied.

Once humans are properly trained in econometrics, they generally do not repeat such mistakes, whereas ChatGPT, despite knowing the relevant theory, can still get it wrong in practice. When using ChatGPT for data analysis, supervision by human experts is still needed to avoid drawing incorrect or even dangerous conclusions.

HSBC then asked ChatGPT to rerun the analysis on month-over-month percentage changes in prices rather than the price levels themselves. The results show how much accounting for non-stationarity matters.

When correlations were computed on the non-stationary data (price levels), ChatGPT reported a correlation coefficient between Texas and Hawaii as high as 94%. After the method was corrected, the correlation between the two states dropped to 58%.
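A minimal sketch of the difference, reusing the wide DataFrame from the reshaping sketch above (the Texas/Hawaii pair follows the example in the text):

```python
# Correlation on raw price levels (non-stationary) versus on month-over-month
# percentage changes, for the Texas/Hawaii pair discussed above.
levels_corr = wide[["Texas", "Hawaii"]].corr().iloc[0, 1]
changes_corr = wide[["Texas", "Hawaii"]].pct_change().corr().iloc[0, 1]

print(f"Correlation of price levels:  {levels_corr:.2f}")   # spuriously high on trending series
print(f"Correlation of MoM % changes: {changes_corr:.2f}")  # the more meaningful figure
```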

HSBC mentioned that in the final house-price forecasting stage, ChatGPT chose an ARIMA model to forecast California house prices. Its choice of model parameters was questionable, however: it arbitrarily fitted an ARIMA(5,1,0) model without giving any reason for that specification.
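For reference, fitting that specification with statsmodels looks roughly like the sketch below; the frequency handling and 12-month forecast horizon are assumptions, and the wide frame again comes from the earlier reshaping sketch:

```python
from statsmodels.tsa.arima.model import ARIMA

# California monthly ZHVI series (ZHVI observations are month-end values).
ca = wide["California"].asfreq("M")

# ARIMA(5,1,0): the order ChatGPT picked without justification. An analyst would
# normally motivate the order via ACF/PACF plots or an information-criterion search.
model = ARIMA(ca, order=(5, 1, 0)).fit()
print(model.summary())

forecast = model.forecast(steps=12)  # 12-month-ahead forecast
```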

Another key issue is that, as the session goes on, ChatGPT forgets the EDA steps it originally planned, so the humans had to keep reminding it to proceed as planned in order for the task to be completed.

Below is the breakdown of the analysis steps HSBC asked ChatGPT to complete (a short illustrative sketch of steps 7 and 9 follows the list):

  1. Data Overview: Display the first and last few rows of the dataset. Check the data types and count of non-null values in each column. Obtain basic data summaries for numerical columns.

  2. Handling Missing Values: Identify columns with missing values. Adopt appropriate strategies to address them, such as dropping the affected rows or columns or imputing the missing values.

  3. Time Analysis: Plot the overall trend of housing prices. Identify periodic or cyclical trends. Highlight outliers or exceptional events.

  4. Regional Analysis: Identify the states with the highest and lowest average house prices. Analyze the growth rates of each state to find the markets with the fastest growth and decline. If possible, visualize the data on a map to discover the regional distribution.

  5. Distribution Analysis: Create histograms or kernel density estimation plots to understand the distribution of house prices. Use box plots to identify outliers and compare the distributions between different states.

  6. Correlation Analysis: Calculate the pairwise correlations of house prices between different states to identify relationships. Visualize the correlations using a heatmap.

  7. Decomposition: If the dataset shows clear trends or seasonality, perform time series decomposition to separate trends, seasonality, and residuals.

  8. Frequency Analysis: Analyze the frequency of significant increases or decreases in house prices. Identify specific months or seasons with peaks or troughs.

  9. Statistical Testing: Conduct appropriate statistical tests based on the questions or hypotheses. For example, if you want to determine if the price difference between two states is statistically significant, you can use a t-test.

  10. Feature Engineering (if planning for modeling): Create lag features, moving averages, and other derived features that may be useful for predictive modeling.

  11. Insights and Documentation: Record all important findings and insights during the exploratory data analysis process. This is useful for subsequent decision-making or result presentation.

  12. Visualization: Utilize various visualization tools and techniques to represent data in an intuitive and insightful manner, including line charts, bar graphs, scatter plots, heatmaps, etc.

  13. Final Report: Summarize key analysis results and provide actionable recommendations or suggestions based on the analysis.
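
To make a couple of these steps concrete, here is a short illustrative sketch of steps 7 and 9, again reusing the wide frame from the earlier sketches; the choice of states and the 12-month seasonal period are assumptions:

```python
from scipy import stats
from statsmodels.tsa.seasonal import seasonal_decompose

# Step 7: decompose the California series into trend, seasonal, and residual components.
decomp = seasonal_decompose(wide["California"].asfreq("M"), model="additive", period=12)
decomp.plot()

# Step 9: test whether average month-over-month growth differs between two states.
# (Testing raw price levels would run into the same non-stationarity issue noted above.)
tx = wide["Texas"].pct_change().dropna()
hi = wide["Hawaii"].pct_change().dropna()
t_stat, p_value = stats.ttest_ind(tx, hi)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```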