Thursday, September 25, 2014

Results from New Methods Show Improvement

The algorithm was run over 26 events, with 33% of the whole dataset held out for testing. The estimated magnitudes were plotted against the actual magnitudes, giving the following plot:


F:\SURF14\Output\MagnitudePreds\Magnitude Pred labeled.png

Moreover, subtracting the estimated value from the actual value for every sensor gives the error, and running a kernel density estimation on these errors gives the following curve:
F:\SURF14\Output\Magnitude-KDE\Newhall 2_cisn.jpg

This suggests that the noise in these sensors can be approximated as Gaussian white noise, which can be used to further enhance the model.
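The error analysis above can be sketched roughly as follows. This is only an illustration: the errors here are synthetic stand-ins rather than the real sensor errors, and the bandwidth rule is an assumption, since the post does not say which one was used.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE built from samples."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb. This is an assumption; the
        # bandwidth used for the original curve is not stated.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Synthetic stand-in for per-sensor errors (actual minus estimated magnitude).
rng = random.Random(0)
errors = [rng.gauss(0.0, 0.3) for _ in range(500)]

kde = gaussian_kde(errors)
step = 0.01
grid = [-2.0 + step * i for i in range(401)]
curve = [kde(x) for x in grid]

# A density should integrate to about 1 over a wide enough grid.
area = sum(curve) * step
peak_x = grid[curve.index(max(curve))]
print(f"area under curve = {area:.3f}, peak at error = {peak_x:.2f}")
```

If the real errors are close to Gaussian, the curve should be roughly bell shaped and centered near zero.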

Band Segmentation Trick

Band Segmentation and Median

Suppose we have N sensors for an event within a 50 km radius. We segment the sensors into five equal bands based on distance from the hypocenter. This is shown in the figure below:

Now, for each band, we take the sensor with the median value of acceleration, together with its distance from the hypocenter. This central measure is representative of all the sensors in that band, so we get a total of five values for each event. We find the coefficients from the training events and then test on the test data set. Thus, for any event, we get a total of five estimates; let's call them mi. Just as the actual Richter magnitude ML is the mean over several seismograms, our prediction is the mean of these five estimates:

ML = sum(mi) / 5

We plot these estimated predictions vs. actual magnitudes to see the performance of our algorithm.
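A minimal sketch of the band-and-median step might look like this. Several things here are assumptions and not taken from the original experiments: the bands are taken to be of equal width (10 km each), the lower median is used when a band has an even number of sensors, empty bands are skipped, and the sensor readings and the coefficients alpha and beta are made up for illustration.

```python
import math
import statistics


def band_representatives(sensors, n_bands=5, max_dist_km=50.0):
    """Pick one (distance_km, accel) pair per non-empty distance band.

    sensors: list of (distance_km, peak_accel) tuples.
    Bands are assumed to be equal-width slices of 0..max_dist_km.
    """
    width = max_dist_km / n_bands
    bands = [[] for _ in range(n_bands)]
    for dist, accel in sensors:
        if 0 < dist <= max_dist_km:
            idx = min(int(dist // width), n_bands - 1)
            bands[idx].append((dist, accel))

    reps = []
    for band in bands:
        if not band:
            continue  # assumption: skip bands without sensors
        median_accel = statistics.median_low([a for _, a in band])
        # Keep the sensor holding the median acceleration, with its distance.
        reps.append(next(s for s in band if s[1] == median_accel))
    return reps


def event_magnitude(sensors, alpha, beta):
    """Mean of the per-band estimates mi = alpha*log10(A) + beta*log10(D)."""
    estimates = [
        alpha * math.log10(accel) + beta * math.log10(dist)
        for dist, accel in band_representatives(sensors)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical readings: (distance in km, peak acceleration).
sensors = [
    (3.0, 0.9), (7.0, 0.5), (9.0, 0.7),
    (12.0, 0.4), (18.0, 0.3),
    (25.0, 0.2),
    (33.0, 0.1), (38.0, 0.15),
    (45.0, 0.05),
]
reps = band_representatives(sensors)
ml = event_magnitude(sensors, alpha=1.0, beta=2.0)  # made-up coefficients
print(reps)
print(round(ml, 3))
```

With all five bands populated, the average over non-empty bands is the same as dividing by 5.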

Magnitude Estimation: Trial 1

The target was to estimate the magnitude using 10 seconds of data after the event occurred. Since a wave can travel only about 50 km in 10 seconds (a speed of roughly 5 km/s), it made sense to consider only stations within a 50 km radius of the epicenter.

Just as with a pick, it doesn't make much sense to treat each sensor as a seismogram and calculate a Richter magnitude from it: every sensor has a different bias and noise level, so each sensor in the array gives a different reading for the same earthquake. This is illustrated by the plot below:
Not quite what we wanted, right?

Notice that the estimates are way off the mark and all over the map, while we expected them to lie along the y = x line.

Instead, we have the advantage of numbers, so we should look at measures of central tendency.

The Dataset for Earthquake Magnitude Estimation


The dataset for these experiments is the same as the dataset for the hypocenter estimation experiments, i.e. roughly 11,000 streams of acceleration data spanning 26 events. Each stream is labeled with the recording station's distance from the epicenter and the magnitude of the earthquake. The magnitudes were verified against the USGS catalog.



The first step was to divide the data into training and test sets by event. Of the 26 events for which data was available, 66% were used to estimate the coefficients, and the resulting equation was tested on the remaining events.
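An event-wise split like the one described could be sketched as below. The random shuffle, the seed, and the event names are assumptions for illustration; the post does not say how the training events were chosen.

```python
import random


def split_events(event_ids, train_fraction=0.66, seed=0):
    """Split whole events (not individual streams) into train and test."""
    ids = list(event_ids)
    random.Random(seed).shuffle(ids)  # assumption: events picked at random
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Placeholder names standing in for the 26 events.
events = [f"event_{i:02d}" for i in range(26)]
train_events, test_events = split_events(events)
print(len(train_events), len(test_events))  # 17 9
```

Splitting by event keeps every stream from one earthquake on the same side of the split, so the test events are never seen while fitting the coefficients.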

New Chapter: Earthquake Magnitude Estimation


Detecting an earthquake alone is not enough for early warning. We also need to assess the severity of the earthquake before we disseminate the warning. Conventionally, the severity of an earthquake is estimated on the Richter scale; modern techniques use the moment magnitude scale. For our experiments, we wish to predict the Richter magnitude of the earthquake as soon as we have an estimate of the hypocenter.

The Richter magnitude of an earthquake is determined from the logarithm of the amplitude of waves recorded by seismographs. The original formula for the magnitude ML is:
M_\mathrm{L} = \log_{10} A - \log_{10} A_\mathrm{0}(\delta) = \log_{10} [A / A_\mathrm{0}(\delta)],
where A is the maximum excursion of the Wood-Anderson seismograph and the empirical function A0 depends only on the distance \delta of the station from the epicenter. In practice, readings from all observing stations are averaged after adjustment with station-specific corrections to obtain the ML value. The chart below gives a feel for what a Richter magnitude translates to in terms of shaking:
http://worldonline.media.clients.ellingtoncms.com/img/photos/2008/05/01/GRA-Kansas_Fault_lines_Version_3_t625.jpg
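As a quick numeric illustration of the formula, the sketch below averages per-station values of log10(A / A0). The amplitudes, the A0 values, and the additive form of the station corrections are all assumptions made for the example, not values from any real catalog.

```python
import math


def richter_ml(amplitudes, a0_values, corrections=None):
    """Average of per-station log10(A / A0), plus optional corrections.

    The station corrections are assumed to be additive; all values passed
    in below are hypothetical.
    """
    if corrections is None:
        corrections = [0.0] * len(amplitudes)
    station_ml = [
        math.log10(a / a0) + c
        for a, a0, c in zip(amplitudes, a0_values, corrections)
    ]
    return sum(station_ml) / len(station_ml)


# One station whose amplitude is 1000 times the reference A0 at its distance.
single = richter_ml([1.0], [0.001])

# Two stations giving 3.0 and 4.0 individually average to 3.5.
averaged = richter_ml([1.0, 10.0], [0.001, 0.001])
print(round(single, 6), round(averaged, 6))
```

Because the scale is logarithmic, a tenfold increase in amplitude at the same distance raises the magnitude by one unit.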

However, CSN sensors are not quite as accurate as seismographs when it comes to measuring the absolute value of acceleration. Still, with a large number of sensors and ample data, we can make corrections and find the right parameters to estimate Richter magnitude from merely cost-effective accelerometers, and do so within seconds of the event.

We assume the following equation for our magnitude estimation:

M(A, D) = α log10 A + β log10 D

where A is the maximum magnitude of acceleration recorded by the sensor in the T = 10 second period, and D is the distance from the estimated hypocenter. For this experiment only, we use the true hypocenter as the estimated hypocenter.
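One way to find α and β for this equation is ordinary least squares on the training data. The sketch below solves the 2x2 normal equations directly. Least squares is an assumption here, since the post does not name the fitting method, and the training records are synthetic, generated from made-up coefficients so that the fit can be checked.

```python
import math


def fit_alpha_beta(records):
    """Least-squares fit of M = alpha*log10(A) + beta*log10(D).

    records: list of (A, D, M) tuples with A > 0 and D > 0.
    """
    sxx = sxy = syy = sxm = sym = 0.0
    for a, d, m in records:
        x, y = math.log10(a), math.log10(d)
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxm += x * m
        sym += y * m
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("log10(A) and log10(D) are collinear; cannot fit")
    alpha = (sxm * syy - sym * sxy) / det
    beta = (sym * sxx - sxm * sxy) / det
    return alpha, beta


# Synthetic records from made-up coefficients (not fitted values from the post).
TRUE_ALPHA, TRUE_BETA = 1.2, 2.0
records = [
    (a, d, TRUE_ALPHA * math.log10(a) + TRUE_BETA * math.log10(d))
    for a in (0.01, 0.05, 0.2)
    for d in (5.0, 20.0, 45.0)
]
alpha, beta = fit_alpha_beta(records)
print(round(alpha, 6), round(beta, 6))
```

On noise-free synthetic data the fit recovers the coefficients it was generated from; on real sensor data the residuals would show how well the two-term model holds.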

Results for Hypocenter Estimation


The latest version of the algorithm was run on 26 earthquakes, with the results simulated as if in real time. The key metrics used for the benchmark were:
  1. How quickly the hypocenter is estimated
  2. The offset between the estimated and true hypocenter
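The second metric can be sketched as a distance between two hypocenters. The post does not say how the offset was computed, so the approach below is an assumption: great-circle (haversine) distance between the epicenters, combined with the depth difference, on a spherical Earth of radius 6371 km. The coordinates in the example are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def hypocenter_offset_km(lat1, lon1, depth1_km, lat2, lon2, depth2_km):
    """Approximate distance in km between two hypocenters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    surface_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    # Combine horizontal and vertical separation (reasonable at short range).
    return math.hypot(surface_km, depth2_km - depth1_km)


# Hypothetical true and estimated hypocenters (lat, lon, depth in km).
true_hypo = (34.0, -118.0, 10.0)
estimated_hypo = (34.1, -118.1, 12.0)
offset = hypocenter_offset_km(*true_hypo, *estimated_hypo)
print(f"offset = {offset:.1f} km")
```

Evaluating this offset at each time step after the event would give the offset vs. time curves used in the benchmark.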

Offset distance vs. time after the earthquake was plotted for all the events, and most of the plots looked like these:

Note that the dive in offset distance starts within seconds of the earthquake occurring. This validates the utility of the Brute Force Hypocenter estimation model. The table below summarizes these results for a few events for CSN sensors:

Name                    Magnitude   Offset (km)   Time after event (s)
La Canada               1.7         48.0360954    1
Fontana 3               4.4         10.9443251    5
Westwood                4.4         17.2685902    6
La Habra                5.1         17.5934921    7
Average                 3.9         23.47         4.75

A second table summarizes the results for CISN sensors:
Name                    Magnitude   Offset (km)   Time after event (s)
Altadena                2.1         18.6          17
Brawley                 5.5         2.7           11
Beverly Hills           3.2         14.3          12
Newhall 2               3.9         5.4           11
Lennox                  2.4         8.8           12
Manhattan Beach         3.3         5.4           11
San Fernando            2.4         5.4           13
Monterey Park           1.7         84.9          13
North Hollywood 2       2.6         10.9          12
Rancho Cucamonga        1.9         12.0          11
Anza                    4.7         2.0           12
Anza 2                  3.4         23.9          12
Fontana 2               2           6.1           11
Los Angeles             2.8         6.6           11
Marina del Rey          3.2         3.6           10
Marina del Rey 2        2.8         12.9          11
Meiners Oaks            3.6         8.6           13
Santa Barbara Channel   4.9         47.6          15
Rancho Palos Verdes     2.9         6.2           14
Weldon                  4.3         17.4          14
Weldon 2                4.2         16.6          18
La Verne                3.7         27.1          11
Joshua Tree             4.2         19.3          11
Fontana 3               4.4         9.5           9
Westwood                4.4         5.7           10
La Habra                5.1         15.3          10
Average                 3.44        15.3          12.1

For these 26 events (CISN, see appendix), we can estimate the hypocenter with an average offset of 15.3 km, on average 12.1 seconds after the earthquake. This is very close to the 10 seconds that we aimed for.

For CSN sensors, the same model was able to estimate hypocenters with an average offset of about 24 km, on average about 5 seconds after the event. This was expected, since the CSN sensors are denser and hence give better timing. The CISN sensors, on the other hand, are less dense but of superior quality, and hence give a better offset.