EEG Signal Classification of Motor Imagery Right and Left Hand using Common Spatial Pattern and Multilayer Perceptron Back Propagation

Abstract: The number of people with disabilities is increasing, which creates a need for bionic devices that can replace human motor functions. A Brain-Computer Interface (BCI) allows a bionic device to communicate with the brain. Brain signals, recorded as an Electroencephalogram (EEG), must be classified in order to drive the corresponding bionic device. The goal of this research is to classify imagined right- and left-hand movements based on the EEG signal. The system consists of EEG channel selection using a Finite Impulse Response (FIR) filter, feature extraction using Common Spatial Pattern (CSP), and classification using Multilayer Perceptron Back Propagation (MLP-BP). The data come from a secondary dataset, BCI Competition IV (2b), with 9 subjects. The experiments vary the number of hidden layer nodes for each EEG channel. Based on the tests, the best MLP-BP accuracy is 68.7%, obtained with 24 nodes in the alpha channel.


I. INTRODUCTION
Over the last few years, the Brain-Computer Interface (BCI) has attracted the attention of researchers because it can directly read activity in the brain and translate it into computer commands [1]. It does so using the Electroencephalogram (EEG) signal, which a BCI converts into commands for electronic equipment [2]. Several types of EEG signals are widely used in BCI systems, one of which is motor imagery. Motor imagery can be defined as a dynamic state in which a person simulates a physical action without actually performing it. Developing a BCI system that accurately distinguishes between different motor imagery signals is therefore essential for improving nerve-based hand control, and it can support the development of bionic hands for people who have had an amputation [3]. Previous research reported that the majority of existing motor imagery studies focused on developing EEG-based BCI systems that distinguish five motor imagery tasks: left hand, right hand, left foot, right foot, and tongue [3].
The recognition procedure includes feature extraction and pattern classification, and feature extraction plays an important role in the classification of EEG signals [4]. Common Spatial Pattern (CSP) is a well-known feature extraction method that has been applied successfully in many motor imagery recognition studies. Significant channels are selected by looking for the maximum spatial vector pattern in the scalp mapping [5]. Ping Wang (2018) [6] presents a new framework for EEG data in motor imagery tasks using CSP feature extraction and reports an accuracy of 74%. Many different methods have also been studied for the pattern classification stage. Widely used modern classifiers include the Support Vector Machine (SVM) and Multilayer Perceptron Backpropagation (MLP-BP) [7]. Both algorithms have their own advantages in motor imagery classification: SVM has an advantage in classification speed, while MLP-BP has an advantage in classification accuracy [

A. EEG Dataset
Graz dataset B is obtained from BCI Competition IV [8]. It contains EEG recordings made while subjects imagined movements of the right and left hands. The dataset covers 9 subjects and uses 3 bipolar recordings (C3, Cz, and C4) with a sampling frequency of 250 Hz. The recordings have a dynamic range of ±100 μV for the screening sessions and ±50 μV for the feedback sessions. Each session consisted of six runs with ten trials for each of the two imagery classes, which results in 120 trials per session. The dataset contains several types of events, but the ones used in this research are 769 (left hand) and 770 (right hand).
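As an illustration of how the two event types could be used, the following minimal sketch cuts a fixed-length window after each left-hand (769) or right-hand (770) cue. The function name `extract_epochs`, the list-based data layout, and the 4 s window length are assumptions made for this example (the window matches 1000 samples at 250 Hz); they are not taken from the dataset tools or from the authors' code.

```python
FS = 250                 # sampling frequency in Hz, as stated for the dataset
LEFT, RIGHT = 769, 770   # event codes for left-hand and right-hand cues


def extract_epochs(signal, events, duration_s=4.0, fs=FS):
    """Cut one window per left/right cue (hypothetical helper).

    signal: list of channels (e.g. C3, Cz, C4), each a list of samples
    events: list of (sample_index, event_code) pairs
    Returns (epochs, labels) with label 0 = left hand, 1 = right hand.
    """
    n = int(round(duration_s * fs))      # window length in samples
    n_total = len(signal[0])
    epochs, labels = [], []
    for start, code in events:
        if code not in (LEFT, RIGHT):
            continue                     # ignore all other event types
        if start < 0 or start + n > n_total:
            continue                     # skip windows that leave the recording
        epochs.append([channel[start:start + n] for channel in signal])
        labels.append(0 if code == LEFT else 1)
    return epochs, labels
```

Each returned epoch has the shape channels × samples, which is the N × T layout that the CSP stage expects.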

B. System Design
The right- and left-hand motor imagery classification system based on EEG signals runs in several consecutive stages. The first stage, filtering, conditions the signal by passing only a certain frequency band (referred to in this paper as a channel). The filtered signal is then processed to produce signal features using the Common Spatial Pattern method. These features are used in the training and testing process, which uses the multilayer perceptron backpropagation method. Fig. 1 shows the system used in this research.

C. Finite Impulse Response
Finite Impulse Response (FIR) is a type of digital filter used in Digital Signal Processing (DSP) applications. It has several advantages over the Infinite Impulse Response (IIR) filter type. The FIR filter is always stable, realizable, and provides a linear phase response under certain conditions, which makes it suitable for high-quality data transmission and audio systems. Because it is implemented using an all-zero transfer function, an FIR filter has no poles outside the unit circle, so the filter is always stable. The main disadvantage of FIR filters is that the number of coefficients required to implement a particular filter is larger than for an IIR filter [9]. The transfer function of an FIR filter is a polynomial in $z^{-1}$. In general, the transfer function of a discrete-time system is defined as:

$$H(z) = \frac{\sum_{i=0}^{M} b_i z^{-i}}{1 + \sum_{i=1}^{N} a_i z^{-i}} \tag{1}$$

where $M$ is the order of the numerator polynomial, $b_i$ is the i-th coefficient of the numerator polynomial, $N$ is the order of the denominator polynomial, and $a_i$ is the i-th coefficient of the denominator polynomial [10]. The FIR filter does not have a recursive part in its transfer function, so the FIR filter transfer function is:

$$H(z) = \sum_{i=0}^{M} b_i z^{-i} \tag{2}$$

The FIR filter design process through the window function is carried out by determining the sampling frequency, the cutoff frequencies, and the filter order. In this research, the Hamming window was used. The Hamming window coefficients $w[n]$ are:

$$w[n] = 0.54 - 0.46 \cos\left(\frac{2\pi n}{M}\right) \tag{3}$$

Next, the unit sample response is calculated. For a band-pass filter with lower and upper cutoff frequencies $\omega_{c1}$ and $\omega_{c2}$, the ideal response $h_d[n]$ is:

$$h_d[n] = \frac{\sin\left(\omega_{c2}(n - M/2)\right) - \sin\left(\omega_{c1}(n - M/2)\right)}{\pi (n - M/2)} \tag{4}$$

The variable $n$ ranges from 0 to $M$, where $M$ is the order of the filter; for $n = M/2$ the value is taken as the limit $(\omega_{c2} - \omega_{c1})/\pi$. The filter response is influenced by the value of $M$. The FIR filter coefficients are then obtained by multiplying the ideal response by the window function:

$$h[n] = h_d[n] \, w[n] \tag{5}$$
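The window design procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration that assumes a band-pass filter built from an ideal sinc-difference response multiplied by a Hamming window; the function names, the filter order of 200, and the alpha-band cutoffs used in the example call are assumptions for this sketch, since the paper does not state the filter order.

```python
import math


def hamming(order):
    """Hamming window with order + 1 coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / order)
            for n in range(order + 1)]


def ideal_bandpass(order, wc1, wc2):
    """Ideal band-pass impulse response delayed by order / 2 samples.

    wc1 and wc2 are cutoffs in radians per sample, 0 < wc1 < wc2 < pi.
    """
    h = []
    for n in range(order + 1):
        k = n - order / 2
        if k == 0:
            h.append((wc2 - wc1) / math.pi)   # limit of the expression at k = 0
        else:
            h.append((math.sin(wc2 * k) - math.sin(wc1 * k)) / (math.pi * k))
    return h


def fir_bandpass(order, wc1, wc2):
    """Windowed design: ideal response multiplied by the Hamming window."""
    return [hd * w for hd, w in zip(ideal_bandpass(order, wc1, wc2),
                                    hamming(order))]


def gain(taps, w):
    """Magnitude of the frequency response at w radians per sample."""
    re = sum(c * math.cos(w * n) for n, c in enumerate(taps))
    im = -sum(c * math.sin(w * n) for n, c in enumerate(taps))
    return math.hypot(re, im)


# Example: a band between 0.064*pi and 0.096*pi rad/sample (assumed order 200)
taps = fir_bandpass(200, 0.064 * math.pi, 0.096 * math.pi)
```

The coefficients are symmetric around the middle tap, which is the condition under which an FIR filter has linear phase. A library routine such as `scipy.signal.firwin` follows the same window method; the hand-written version is shown only to make the individual steps visible.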

D. Common Spatial Pattern
Common Spatial Pattern (CSP) is a widely used feature extraction method. It produces a set of spatial filters that decompose multidimensional data into a set of uncorrelated components [11]. The basic principle of CSP is to find an optimal spatial filter that maximizes the ratio of the mean variances of two different classes [1], [12]. Each EEG trial has N channels (electrodes) and T samples per channel, resulting in data of size N × T. The normalized spatial covariance matrix of each class is determined through equation (6):

$$C_{L,R} = \frac{X_{L,R} X_{L,R}^{T}}{\operatorname{trace}\left(X_{L,R} X_{L,R}^{T}\right)} \tag{6}$$

where $C_{L,R}$ is the covariance matrix of the left-hand or right-hand EEG signal $X_{L,R}$, and $\operatorname{trace}(X)$ is the sum of the diagonal elements of $X$. The trace is used to normalize the matrix $C_{L,R}$. The mean covariance matrices of the two classes, $\bar{C}_L$ and $\bar{C}_R$, are then added to form the composite matrix $C$:

$$C = \bar{C}_L + \bar{C}_R \tag{7}$$

The composite matrix $C$ is then factored as

$$C = U \hat{\lambda} U^{T} \tag{8}$$

where $U$ is the eigenvector matrix and $\hat{\lambda}$ is the diagonal matrix of eigenvalues. The whitening transformation matrix $P$ is:

$$P = \hat{\lambda}^{-1/2} U^{T} \tag{9}$$

The matrix $P$ equalizes the variances in the space spanned by the eigenvectors in $U$. It is applied to the mean covariance matrices $\bar{C}_L$ and $\bar{C}_R$ [10], which transforms them into matrices that share a common eigenvector matrix $B$:

$$S_L = P \bar{C}_L P^{T} = B \lambda_L B^{T}, \qquad S_R = P \bar{C}_R P^{T} = B \lambda_R B^{T}, \qquad \lambda_L + \lambda_R = I \tag{10}$$

Then, the spatial filter can be obtained as

$$W = B^{T} P \tag{11}$$

In the standard CSP formulation, a trial $X$ is projected as

$$Z = W X \tag{12}$$

and the feature vectors of left-hand and right-hand motor signals are selected from the first and last $m$ rows of $Z$, defined as

$$f_p = \log\left(\frac{\operatorname{var}(Z_p)}{\sum_{i=1}^{2m} \operatorname{var}(Z_i)}\right) \tag{13}$$
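The CSP steps above (normalized covariance, whitening, joint diagonalization, variance features) can be sketched with NumPy as follows. This is a minimal illustration of the standard two-class CSP procedure, not the authors' implementation; the function names and the use of log-normalized variances as features are assumptions for this sketch.

```python
import numpy as np


def normalized_cov(X):
    """Spatial covariance of one trial X (channels x samples), divided by its trace."""
    C = X @ X.T
    return C / np.trace(C)


def csp_filters(trials_left, trials_right, m=1):
    """Return 2m spatial filters (as rows) from two lists of trials."""
    C_l = np.mean([normalized_cov(X) for X in trials_left], axis=0)
    C_r = np.mean([normalized_cov(X) for X in trials_right], axis=0)
    # Eigendecomposition of the composite covariance, then the whitening matrix P
    evals, U = np.linalg.eigh(C_l + C_r)
    P = np.diag(evals ** -0.5) @ U.T
    # Whitened left-class covariance; its eigenvectors B also diagonalize the right class
    _, B = np.linalg.eigh(P @ C_l @ P.T)   # eigenvalues in ascending order
    W = B.T @ P                            # full projection matrix
    # Keep the m filters with the smallest and the m with the largest left-class variance
    n = W.shape[0]
    keep = np.r_[0:m, n - m:n]
    return W[keep]


def log_var_features(W, X):
    """Project one trial and return the log of the normalized variances."""
    Z = W @ X
    var = np.var(Z, axis=1)
    return np.log(var / var.sum())
```

With m = 1 and three channels, `csp_filters` returns a 2 × 3 matrix, so each trial is reduced to a two-element feature vector. Row 0 corresponds to the direction where the left-hand class has the least variance and the last row to the direction where it has the most.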

E. Multilayer Perceptron Backpropagation
The multilayer perceptron (MLP) is the best-known and most popular neural network among existing neural network paradigms. It is a variant of the original perceptron model proposed by Rosenblatt in the 1950s and is also known as a feed-forward neural network [13]. It generally has three kinds of layers: input, hidden, and output. The input layer accepts the input without performing any operation on it, and the input values are passed to the hidden units without going through an activation function. In the hidden units, the input is processed and the activation function is evaluated for each neuron; the results are then passed to the next layer.
The backpropagation algorithm sends the error from the output layer back to the hidden layer. This error is related to the cost function of the multilayer perceptron, and the gradient of the cost function is computed during the learning process [10]. The error starts at the output layer and moves backward through the hidden layer toward the input layer. The computation mirrors the feed-forward pass; the only difference is that the signals flow in the opposite direction. The backpropagation algorithm is illustrated in Fig. 2.

Fig. 2. MLP-BP algorithm process [14]

The complete MLP-BP algorithm is as follows.

Feed forward: the MLP feed-forward process can be formulated using equations (14)-(17):

$$z_j = \sum_i w_{ij} x_i \tag{14}$$

$$h_j = f(z_j) \tag{15}$$

$$o_k = \sum_j w_{jk} h_j \tag{16}$$

$$\hat{y}_k = f(o_k) \tag{17}$$

where $x$ is the input, $w$ is a weight, $h$ is the output of the hidden layer, $o$ is the output layer, and $\hat{y}$ is the classification result; $z$ denotes the weighted input of the hidden layer and $f$ the activation function.

Output error: an error is found at the output of each neuron, and it can be formulated using equations (18) and (19):

$$e_k = \hat{y}_k - y_k \tag{18}$$

$$\delta_k = e_k \, f'(o_k) \tag{19}$$

In the equations above, $f'$ is the derivative of the activation function at each output node, $o$ is the output of each output-layer node, $y$ is the original or true label from the training data, and $\hat{y}$ is the classification output value. Meanwhile, $e$ is the error value of each classification result, and $h$ and $\hat{y}$ are the outputs of the hidden and output layers after activation.

Backpropagate the error: the output error is propagated to the hidden layer through the transpose of the weight matrix, and the weight update is carried out by combining the two errors (output layer and hidden layer).
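The three steps named above (feed forward, output error, backpropagate the error) can be sketched for a network with one hidden layer as follows. This is a minimal illustration, not the authors' implementation: the sigmoid activation, the squared-error cost, the absence of bias terms, the learning rate, and all function names are assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, W1, W2):
    """Feed forward: returns hidden-layer output and classification output."""
    h = sigmoid(X @ W1)
    y_hat = sigmoid(h @ W2)
    return h, y_hat


def loss(X, t, W1, W2):
    """Half the mean squared error between output and true labels t (n x 1)."""
    _, y_hat = forward(X, W1, W2)
    return 0.5 * np.mean((y_hat - t) ** 2)


def gradients(X, t, W1, W2):
    """Gradients of the loss with respect to W1 and W2 via backpropagation."""
    h, y_hat = forward(X, W1, W2)
    # Output error: (prediction - label) times the sigmoid derivative
    delta_o = (y_hat - t) * y_hat * (1.0 - y_hat)
    # Error sent back to the hidden layer through the transposed weights
    delta_h = (delta_o @ W2.T) * h * (1.0 - h)
    n = X.shape[0]
    return X.T @ delta_h / n, h.T @ delta_o / n


def train(X, t, n_hidden=24, lr=0.5, epochs=500, seed=0):
    """Batch gradient descent; n_hidden mirrors the 8/16/24-node scenarios."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    for _ in range(epochs):
        g1, g2 = gradients(X, t, W1, W2)
        W1 -= lr * g1
        W2 -= lr * g2
    return W1, W2
```

Here `X` would hold one row of CSP features per trial and `t` the corresponding 0/1 labels as a column vector. Comparing the analytic gradients with finite differences of `loss` is a quick way to confirm that the backward pass matches the forward pass.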

F. Confusion Matrix
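A performance measure is needed to evaluate the model in the classification process. A confusion matrix compares the predicted values with the actual values and provides a measure of misclassification [14]. The confusion matrix is shown in Table I. The accuracy of the model can be calculated with the following equation:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$

where $TP$ and $TN$ are the numbers of true positives and true negatives, and $FP$ and $FN$ are the numbers of false positives and false negatives.

III. RESULT AND DISCUSSION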
The training and test samples are raw EEG signals stored as matrices, and only part of the dataset is used in this research. The sample data to be tested has a length of 469011 samples. The dataset uses a sampling rate of 250 Hz, with varying amplitude ranges. Each sample used has 1000 points, obtained by multiplying the sampling frequency by the recording time (250 Hz × 4 s). For readability, the plots show only samples 0 to 100, which makes differences in amplitude easier to compare. Fig. 3a has an amplitude range from -9.16 µV to 10.18 µV, while Fig. 3b has an amplitude range from -8.23 µV to 9.87 µV from 3 bipolar electrodes. FIR filtering splits the motor imagery data into 5 bands according to the type of EEG wave: Delta, Theta, Alpha, Beta, and Gamma. The cutoff frequencies for each band can be seen in Table II.

TABLE II.

No.  Wave   Fc1 (rad/sample)  Fc2 (rad/sample)
1    Delta  0.004π            0.032π
2    Theta  0.032π            0.064π
3    Alpha  0.064π            0.096π
4    Beta   0.096π            0.28π
5    Gamma  0.28π             0.8π

The filters are designed with the firwin method and the Hamming window. As a result, the amplitudes of the filtered signals do not differ much, especially in the Theta, Alpha, and Beta channels, because these 3 channels are filtered with a Band Pass Filter (BPF). Their values look significantly different from the Delta channel, which uses a Low Pass Filter (LPF), and the Gamma channel, which uses a High Pass Filter (HPF). After filtering, the signal amplitude decreases because only the passband is retained. Fig. 4a has an amplitude range from -4.76 µV to 4.76 µV, while Fig. 4b [...]

Feature extraction serves to reduce the size of the data. The sample data are used to train the CSP filters so that their characteristics can be observed. Because the dataset has 2 classes, the CSP setting is m = 1, so the number of spatial patterns is 2m = 2. A spatial pattern is a topographic image of the covariance values at each motor cortex electrode (C3, Cz, C4), as shown in Fig. 5.

Fig. 5. Spatial pattern of EEG signals

CSP0 and CSP1 are two spatial patterns that hold the projection matrix values for each class. CSP0 is dominated by red on the right side of the head, which indicates positive or high covariance values, while CSP1 is dominated by blue on the left side of the head, which indicates negative or low covariance. Both patterns have values in the range -1.6 to +1.6. The variance of the projected signals is then calculated and used as the feature for the MLP-BP classification.
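The cutoff frequencies listed in Table II can be related to the usual EEG band edges in Hz with a short calculation. The sketch below assumes that the table values are normalized angular frequencies in radians per sample at the 250 Hz sampling rate, so that f = ω · fs / (2π); the dictionary and function names are introduced only for this illustration.

```python
FS = 250.0  # sampling frequency of the dataset in Hz

# (lower, upper) cutoffs from Table II, written as multiples of pi
BANDS = {
    "Delta": (0.004, 0.032),
    "Theta": (0.032, 0.064),
    "Alpha": (0.064, 0.096),
    "Beta": (0.096, 0.28),
    "Gamma": (0.28, 0.8),
}


def to_hz(multiple_of_pi, fs=FS):
    """Convert w = multiple_of_pi * pi rad/sample to Hz: f = w * fs / (2 * pi)."""
    return multiple_of_pi * fs / 2.0


for name, (low, high) in BANDS.items():
    print(f"{name:5s} {to_hz(low):6.1f} Hz to {to_hz(high):6.1f} Hz")
```

Under this reading, the Alpha band in Table II spans 8 Hz to 12 Hz and the Beta band 12 Hz to 35 Hz. The same multiples of π are what `scipy.signal.firwin` expects as cutoffs when no sampling frequency is passed, because it normalizes to the Nyquist frequency.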
The sample data whose features have been extracted are then classified. Classification assigns each sample, which contains a large amount of data, to a class. The data are shuffled and evaluated with k-fold cross-validation using k = 10. The validation performance is measured as the accuracy for the 5 types of EEG waves: Delta, Theta, Alpha, Beta, and Gamma. The system is evaluated with 2 classes and with 8, 16, and 24 hidden layer nodes, while the number of input layer nodes equals the number of CSP output features. Based on the accuracy results in Table III, the best MLP-BP accuracy is found in the Alpha channel: 68.7% in the scenario with 24 hidden layer nodes.

In general, the results indicate that the EEG signal produced by motor movement and by the imagination (desire) to perform motor movement lies in the Alpha (8-13 Hz) and Beta (below 20 Hz) ranges. This agrees with Yuliantoro's research [15], which states that EEG signals in the mu rhythm (7-12 Hz) and Beta (above 13 Hz) frequency ranges generated from the center of the brain are closely related to commands for motor imagery. Varying the number of nodes did not significantly affect accuracy, but the best accuracy was always obtained in the alpha or beta channel. For better accuracy, future research could focus on the beta channel.
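The evaluation procedure described above (shuffling, 10-fold cross-validation, accuracy from the confusion matrix counts) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names, the interleaved fold assignment, and the `fit`/`predict` callables are hypothetical and stand in for whatever classifier is being evaluated.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Accuracy in percent: (TP + TN) / (TP + TN + FP + FN) * 100."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def cross_validate(features, labels, fit, predict, k=10):
    """Mean accuracy over k folds; fit and predict are user-supplied callables."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        predicted = predict(model, [features[i] for i in test])
        scores.append(accuracy([labels[i] for i in test], predicted))
    return sum(scores) / len(scores)
```

The value returned by `cross_validate` is the average of the per-fold accuracies, which is the kind of figure reported per band and per hidden-node setting.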

IV. CONCLUSION
The performance of the classification system is determined by the cut-off frequencies used for channel division, the CSP setting, and the number of hidden nodes that gives the best accuracy. The highest average accuracy over 10-fold cross-validation is 68.7%, obtained in the 24-node scenario. The alpha channel is the best of all EEG channels, consistently providing the highest accuracy across the hidden node variations in this research. Varying the number of nodes affects the accuracy on several channels, and this effect depends on the frequency band used in filtering. The best frequency band lies in the 8-20 Hz range, covering the Alpha and Beta channels.