Workflow class in pyiron_contrib as basis for ironflow #194
Description
Summary
The functionality of @liamhuber's workflow class in pyiron_contrib could be a good starting point for interesting extensions and for merging pyiron_base and ironflow.
@liamhuber, I have only now had a chance to install your workflow demo class and play with it. It is really great work. Even though it is presently only a demo, I see a lot of potential. In the attached Jupyter notebook I have tried to briefly summarize and sketch some of my ideas. It is only a very preliminary sketch, and it would be good to talk about it. The notebook contains many ideas and suggestions; some are probably straightforward to implement, others may be hard or impossible. It is meant as a collection of ideas rather than a list of tasks.
Node-based pyiron concept
Based on the workflow example implemented in pyiron_contrib by Liam, here are a few suggestions/ideas.
```python
%config IPCompleter.greedy=True
import sys
sys.path.insert(1, '..')
sys.path;
```
```python
import matplotlib.pylab as plt

from pyiron_contrib.workflow.node import Node, node
from pyiron_contrib.workflow.workflow import Workflow
from pyiron_contrib.workflow import nodes
```
Make Workflow the next generation pyiron Project object
```python
wf = Workflow('my_first_workflow')
```
Design concepts
A node with a single output should present this output directly at the object level
- modify __repr__ etc.
- allow a short notation for links
- see the example below
```python
wf.structure = nodes.bulk_structure(repeat=3, cubic=True, element="Al")
```
This node object should behave as similarly as possible to the output object, i.e., its representation should be the same:
```python
wf.structure.outputs.structure.value;
wf.structure;  # has same representation (__str__, __repr__, etc.)
```
It should also be shorter to access the output object:
```python
wf.structure.value.plot3d()
wf.structure.plot3d()  # nice, but may be too much overloading
```
To set a node link, the node reference should be sufficient:
- `wf.calc = nodes.calc_md(job=wf.engine.outputs.job)` can be replaced by
- `wf.calc = nodes.calc_md(job=wf.engine)` (see the sketch below)
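A minimal sketch of how both shorthands (single-output `__repr__` delegation and node-as-connection) could be implemented; all names here are illustrative, not the existing pyiron_contrib API:
```python
class SketchNode:
    """Illustrative node that stands in for its single output channel."""

    def __init__(self, label, outputs):
        self.label = label
        self.outputs = outputs  # dict: channel name -> channel object

    @property
    def single_output(self):
        # Only meaningful when there is exactly one output channel
        channels = list(self.outputs.values())
        if len(channels) != 1:
            raise AttributeError(f"'{self.label}' has {len(channels)} outputs")
        return channels[0]

    def __repr__(self):
        # A node with a single output represents itself as that output's value
        try:
            return repr(self.single_output.value)
        except AttributeError:
            return f"<Node '{self.label}'>"


def resolve_connection(obj):
    """Accept either an output channel or a node when wiring inputs."""
    if isinstance(obj, SketchNode):
        return obj.single_output  # a node with one output stands in for it
    return obj
```
With such a resolver in the input-channel setter, `job=wf.engine` and `job=wf.engine.outputs.job` would become equivalent.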
The original code block can then be simplified:
```python
wf.structure = nodes.bulk_structure(repeat=3, cubic=True, element="Al")
wf.engine = nodes.lammps(structure=wf.structure.outputs.structure)
wf.calc = nodes.calc_md(
    job=wf.engine.outputs.job,
    update_on_instantiation=False,
    # run_automatically=False
)
wf.plot = nodes.scatter(
    x=wf.calc.outputs.steps,
    y=wf.calc.outputs.temperature
)
```
New version (pseudocode)
```python
wf = Workflow('my_first_workflow', domain='pyiron.atomistic')  # set default domain for nodes

structure = wf.create.structure.bulk(repeat=3, cubic=True, element="Al")  # adds node to wf; structure.wf stores wf
engine = wf.create.engines.lammps(structure=structure.outputs.structure)
calc = wf.create.calc_md(
    job=engine,
    update_on_instantiation=False  # should be set as the default in the node definition for longer jobs
)
wf.create.plot.scatter(
    x=wf.calc.outputs.steps,  # more than one output, so an explicit statement is required
    y=wf.calc.outputs.temperature
)
```
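One way such a `wf.create` namespace could be realized is a small factory object that resolves dotted paths in a node registry and registers each new instance on the workflow. This is only a sketch; `NodeNamespace`, the registry layout, and `Workflow.add` are assumptions:
```python
class NodeNamespace:
    """Sketch: dotted access to node factories, e.g. wf.create.structure.bulk(...)."""

    def __init__(self, workflow, registry):
        self._workflow = workflow
        self._registry = registry  # nested dict: name -> node class or sub-dict

    def __getattr__(self, name):
        entry = self._registry[name]
        if isinstance(entry, dict):
            # Descend into a sub-namespace, e.g. create.structure
            return NodeNamespace(self._workflow, entry)

        def factory(**kwargs):
            node = entry(**kwargs)    # instantiate the node class
            node.wf = self._workflow  # the node remembers its workflow
            self._workflow.add(node)  # assumed registration hook
            return node

        return factory
```
The `domain` argument of `Workflow` would then simply select which registry the namespace is rooted in.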
Execute workflow
```python
from pyiron_contrib.workflow import Executor

exe = Executor(server='my_server', cores=24, queue='my_queue')  # task manager
exe.list_servers(), exe.task_table(), ...  # provide utility functions
```
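A sketch of what such an Executor could look like, with a local process pool as fallback; the class body, its fields, and the remote dispatch are assumptions, not a proposed pyiron_contrib API:
```python
from concurrent.futures import ProcessPoolExecutor


class Executor:
    """Sketch: a resource handle that workflows are submitted to."""

    def __init__(self, server='localhost', cores=1, queue=None):
        self.server = server
        self.cores = cores
        self.queue = queue
        self._pool = None

    def submit(self, fn, *args, **kwargs):
        # Local fallback: run in a process pool sized to the requested cores;
        # a real implementation would serialize the workflow and dispatch it
        # to the remote server/queue instead
        if self._pool is None:
            self._pool = ProcessPoolExecutor(max_workers=self.cores)
        return self._pool.submit(fn, *args, **kwargs)
```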
Utility functions
```python
wf.run()                 # corresponds to update in the original version (modal)
wf.submit(executor=exe)  # run in background, submit to queue, etc.; requires (de-)serialization
wf.status                # finished, aborted, unfinished nodes, etc.
wf.task_table()          # filtered for the specific workflow
wf.visualize()           # show the (static) graph network
wf.ironflow()            # show the workflow as a graph in ironflow
wf.save()                # save also local nodes (provide all info to run on any machine)
wf.load()
wf.__repr__()            # show the output of the final node (in the above case the matplotlib graph),
                         # maybe with a 'signature' of the workflow (name, status, etc.)
```
Create a new workflow from the same workflow with modified parameters
```python
wf(structure=wf.create.structure.bulk(repeat=3, cubic=True, element="Cu"), name='wf_Cu')
```
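Calling the workflow like this could return a modified copy; a sketch under the assumption that a `replace` helper exists to swap a node while keeping its connections:
```python
import copy


class Workflow:
    # ... existing workflow machinery from above ...

    def __call__(self, name=None, **node_replacements):
        """Sketch: return a copy of this workflow with selected nodes replaced."""
        new_wf = copy.deepcopy(self)
        if name is not None:
            new_wf.name = name
        for label, node in node_replacements.items():
            new_wf.replace(label, node)  # assumed: swap node, keep connections
        return new_wf
```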
Extend data objects
In nodes such as calc_md the number of output objects/values is rather large. It would therefore be highly attractive to replace them by data objects, which could provide additional functionality (e.g. ontology). The decorator should automatically translate the object to provide all fields. Alternatively, we could use the ontology to connect individual elements (e.g. positions) with such a data object (the ontology knows that positions is part of MD_data).
```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MD_data:
    positions: list | np.ndarray = None
    temperature: int = 0
    # ...
```
```python
out = MD_data()
out.temperature
out.positions = np.linspace(0, 100, 10)
out.positions.shape  # (10,)
```
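The decorator could derive one output channel per dataclass field via `dataclasses.fields`; a sketch reusing the `MD_data` class above (the channel wiring itself is an assumption):
```python
from dataclasses import fields


def output_channels_from(data_type):
    """Sketch: one output channel per dataclass field."""
    return {f.name: f.type for f in fields(data_type)}


# The node decorator could call this to expose e.g. calc.outputs.positions
# and calc.outputs.temperature automatically from a single MD_data object
print(output_channels_from(MD_data))
```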
Extend input and output
File objects
Make sure that files are correctly transferred, copied, etc.
```python
file_out_1 = node1(file_in)
file_out_2 = node2(file_out_1)
```
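A minimal sketch of a file object that could travel between nodes; the `FileObject` name and its copy semantics are illustrative assumptions:
```python
import shutil
from pathlib import Path


class FileObject:
    """Sketch: a file handle passed between nodes instead of a bare path."""

    def __init__(self, path):
        self.path = Path(path)

    def copy_to(self, working_directory):
        # Copy the underlying file into the consuming node's working
        # directory and hand back a new handle pointing at the copy
        target = Path(working_directory) / self.path.name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(self.path, target)
        return FileObject(target)
```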
Storage
Include (HDF5) storage in the workflow object.
When a node produces output, put the (filtered) data into the workflow data store. This should replace our present HDF5 storage.
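A sketch of such a storage hook using h5py; the group layout, the filter argument, and the function name are assumptions:
```python
import h5py
import numpy as np


def store_node_output(h5_path, node_label, outputs, keys=None):
    """Sketch: write (filtered) node outputs into the workflow's HDF5 store."""
    with h5py.File(h5_path, "a") as f:
        group = f.require_group(node_label)
        for name, value in outputs.items():
            if keys is not None and name not in keys:
                continue  # filter: only store the requested channels
            if name in group:
                del group[name]  # overwrite data from a previous run
            group.create_dataset(name, data=np.asarray(value))
```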