Input data semantics and dimensions for seismic data

Hey Liu,

I really like your work. It's a great contribution to improving AI capabilities in the seismology domain.

I'm trying to reproduce this work in the petroleum seismic acquisition domain, where only a single-component trace amplitude is recorded.
I've attached a picture of the embeddings I obtained using the following pipeline.

```mermaid
flowchart LR
    trace --> bandpass_filtering
    trace --> resampling 

    resampling & bandpass_filtering --> detrending --> amplitude_to_velocity_conversion

    amplitude_to_velocity_conversion --> hilbert_space_estimation --> n_component 
    hilbert_space_estimation --> e_component 

    
    n_component & e_component & trace --> trace_3c
    trace_3c --> standardisation

    m1[("`wav2vec 
    random_init`")]

    m2[("`wav2vec 
    pretrained`")]

    standardisation --> m1 
    standardisation --> m2 

    t1((t-sne))

    m1 -->  t1 --> embeddings_wav2vec_random_init
    m2 --> t1 --> embeddings_wav2vec_pretrained

    detrending --> standardisation2 -->|detrended trace|t1 --> embeddings_trace

    embeddings_wav2vec_random_init & embeddings_wav2vec_pretrained & embeddings_trace --> v1["`
    **Visual Comparison**
    good vs dead
    good vs reverse
    good vs noise
    good vs powerline
    all classes
    `"]

    class_labels --> v1
```
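For concreteness, the preprocessing branch of the diagram can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the exact code behind the plot: the function name `make_pseudo_3c`, the sampling rates, the band edges, and the channel order are all hypothetical. The diagram does not say how the two pseudo-components are derived from the Hilbert step, so here the quadrature signal and the envelope are used purely as an illustration. The filter and resampler are applied in sequence, and the amplitude-to-velocity conversion is left out because it depends on the instrument response.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly, detrend, hilbert


def make_pseudo_3c(trace, fs_in=500, fs_out=100, band=(1.0, 45.0)):
    """Build a standardised (3, n_samples) array from one single-component trace.

    All defaults are illustrative assumptions, not values taken from SeisLM.
    """
    # Zero-phase Butterworth band-pass (band edges are assumed values).
    sos = butter(4, band, btype="bandpass", fs=fs_in, output="sos")
    x = sosfiltfilt(sos, np.asarray(trace, dtype=float))

    # Polyphase resampling from fs_in to fs_out (both must be integers).
    x = resample_poly(x, fs_out, fs_in)

    # Remove a linear trend.
    x = detrend(x, type="linear")

    # Assumption: pseudo "N" is the quadrature (Hilbert transform) of the trace
    # and pseudo "E" is the instantaneous amplitude envelope.
    analytic = hilbert(x)
    n_component = np.imag(analytic)
    e_component = np.abs(analytic)

    # Assumed channel order: recorded trace first, then the two pseudo-components.
    trace_3c = np.stack([x, n_component, e_component])

    # Per-channel standardisation to zero mean and unit variance.
    trace_3c = trace_3c - trace_3c.mean(axis=-1, keepdims=True)
    trace_3c = trace_3c / (trace_3c.std(axis=-1, keepdims=True) + 1e-8)
    return trace_3c
```

With a 10 s trace recorded at 500 Hz (5000 samples), this returns an array of shape `(3, 1000)` at 100 Hz.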

![Image](https://github.com/user-attachments/assets/023d2da6-24c4-4e8a-8a22-2a6c87be50e6)

The colors indicate the different classes of seismic trace contamination.

Could you tell me whether I need to convert the per-trace amplitude to velocity or acceleration before extracting features with the pre-trained SeisLM?

Also, could you confirm whether the input to the conv encoder during pre-training and downstream tasks had 12000 samples per trace (a 120 s signal sampled at 100 Hz)? Or were 3001 samples per trace randomly sampled, as shown in the pretraining example notebook?
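To make the second option concrete, this is the kind of random windowing I have in mind. It is a minimal sketch, assuming a `(channels, samples)` array; `random_crop` is a hypothetical helper and not a function from the SeisLM code base. At 100 Hz, 3001 samples would correspond to a 30 s window with both endpoints included.

```python
import numpy as np


def random_crop(x, window=3001, rng=None):
    """Return a random contiguous window of `window` samples along the last axis."""
    rng = np.random.default_rng() if rng is None else rng
    n_samples = x.shape[-1]
    if n_samples < window:
        raise ValueError(f"trace has {n_samples} samples, need at least {window}")
    # Draw a start index so that the window fits entirely inside the trace.
    start = int(rng.integers(0, n_samples - window + 1))
    return x[..., start:start + window]


# Example: crop a 120 s, 100 Hz, three-channel record down to 3001 samples.
record = np.zeros((3, 12000))
crop = random_crop(record, window=3001, rng=np.random.default_rng(0))
```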

Have you analysed how the embedding distribution changes with the input length L?
