Hey Liu
I really like your work. It's a great contribution to improving AI capabilities in the seismology domain.
I'm trying to reproduce this work in the petroleum seismic acquisition domain, where only a single-component trace amplitude is recorded.
I've attached a picture of embeddings I obtained using the following pipeline.
flowchart LR
trace --> bandpass_filtering
trace --> resampling
resampling & bandpass_filtering --> detrending --> amplitude_to_velocity_conversion
amplitude_to_velocity_conversion --> hilbert_space_estimation --> n_component
hilbert_space_estimation --> e_component
n_component & e_component & trace --> trace_3c
trace_3c --> standardisation
m1[("`wav2vec
random_init`")]
m2[("`wav2vec
pretrained`")]
standardisation --> m1
standardisation --> m2
t1((t-sne))
m1 --> t1 --> embeddings_wav2vec_random_init
m2 --> t1 --> embeddings_wav2vec_pretrained
detrending --> standardisation2 -->|detrended trace|t1 --> embeddings_trace
embeddings_wav2vec_random_init & embeddings_wav2vec_pretrained & embeddings_trace --> v1["`
**Visual Comparison**
good vs dead
good vs reverse
good vs noise
good vs powerline
all classes
`"]
class_labels --> v1
The colors indicate different classes of seismic trace contamination.
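For reference, here is a rough Python sketch of how the single-component trace is turned into a 3-channel input in the pipeline above. It is only an approximation under several assumptions: the function name `build_pseudo_3c`, the 500 Hz input rate, the 1-45 Hz band, and the channel order are all placeholders, the filter and resampling steps are applied sequentially, the N channel is taken as the Hilbert transform of the trace and the E channel as its envelope, and the amplitude-to-velocity step is left out because that is exactly the part I'm unsure about.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly, detrend, hilbert


def build_pseudo_3c(trace, fs_in=500, fs_out=100, band=(1.0, 45.0)):
    """Build a standardised pseudo 3-component array from one 1-D trace.

    All defaults are hypothetical values chosen for illustration.
    """
    # Zero-phase band-pass filter at the original sampling rate.
    sos = butter(4, band, btype="bandpass", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, trace)

    # Resample to the target rate (100 Hz assumed for the model input).
    resampled = resample_poly(filtered, fs_out, fs_in)

    # Remove a linear trend.
    z = detrend(resampled, type="linear")

    # Analytic signal: real part is z, imaginary part is its Hilbert transform.
    analytic = hilbert(z)
    n = np.imag(analytic)  # assumption: 90-degree phase-shifted copy as "N"
    e = np.abs(analytic)   # assumption: instantaneous envelope as "E"

    # Stack into (channels, samples); the channel order is an assumption.
    x = np.stack([z, n, e])

    # Per-channel standardisation to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + 1e-8)


rng = np.random.default_rng(0)
x = build_pseudo_3c(rng.standard_normal(5000))  # 10 s of synthetic data at 500 Hz
print(x.shape)
```

The synthetic input is just white noise, so the output is only useful for checking shapes, not for judging the embeddings.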
Could you tell me whether I need to convert the per-trace amplitude to velocity or acceleration before extracting features with the pre-trained SeisLM?
Also, could you confirm whether the input to the conv encoder during pre-training and downstream tasks had 12000 samples per trace (a 120 s signal sampled at 100 Hz)? Or was a window of 3001 samples per trace randomly sampled, as shown in the pretraining example notebook?
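To make the two options concrete, this is a minimal sketch of what I mean by random sampling: cropping a 3001-sample window from a 12000-sample, 3-channel trace. The helper `random_window` is hypothetical and is not taken from the SeisLM code or its data loader.

```python
import numpy as np

SAMPLING_RATE_HZ = 100
FULL_DURATION_S = 120
N_FULL = SAMPLING_RATE_HZ * FULL_DURATION_S  # 12000 samples for the full trace


def random_window(trace_3c, window=3001, rng=None):
    """Return a random contiguous window along the last (time) axis."""
    rng = np.random.default_rng() if rng is None else rng
    n_samples = trace_3c.shape[-1]
    if n_samples < window:
        raise ValueError("trace is shorter than the requested window")
    # Start index drawn uniformly so the window always fits inside the trace.
    start = int(rng.integers(0, n_samples - window + 1))
    return trace_3c[..., start:start + window]


full_trace = np.zeros((3, N_FULL))  # placeholder data, shape (channels, samples)
crop = random_window(full_trace, rng=np.random.default_rng(0))
print(full_trace.shape, crop.shape)
```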
Have you analysed how the embedding distribution changes with the input length L?