Pragmatic and Essential

Key Challenge

Consider the day-to-day work of computational scientists. To complete a single computational assignment, they might run Python code with complex conda environment dependencies, process files with several installed software packages, and submit heavy jobs to a remote supercomputing center. Some document these steps in a file for sharing or future reference, but repeating the steps by hand through trial and error is tedious and error-prone. Others write a Python script to automate the whole procedure, only to find that the number of cases it must handle spirals out of control. Even when a simple script works well enough, a new assignment often means rewriting it from scratch. Debugging? That's a real challenge. Different software versions, environment conflicts, remote job failures... It's frustrating.

That's why we built BoCoFlow. It's the practical tool we needed but never had – a solution that understands how computational scientists actually work.

BoCoFlow Workflow

Nodes at the Conceptual Level

While any computation can be wrapped as a node, the key is finding the right level of abstraction. We define nodes at a conceptual level that matches how computational scientists naturally think about their work, which avoids unnecessary complexity while keeping flexibility. Pushing nodes down to a lower level of abstraction (for example, wrapping individual functions or numerical operations) would compromise both performance and conceptual clarity.

Hierarchy of Computational Tasks

We organize computational tasks into four distinct layers, each building upon the previous to enable increasingly complex scientific workflows.

L-4. Orchestration Layer (Workflow Level)

The highest level where workflows orchestrate multiple applications and processes to achieve complex scientific objectives. This is where researchers design end-to-end pipelines.

Examples: Drug discovery pipelines, Material optimization workflows, Multi-stage data processing

L-3. Application Layer (Node Level)

Complete software packages and applications that serve as the building blocks of workflows. Each node represents a self-contained computational process.

Examples: Molecular dynamics simulations, Quantum chemistry calculations, Data analysis applications

L-2. Assembly Layer (Component Level)

Reusable functions and operations that combine primitive operations into meaningful computational snippets.

Examples: Force field calculations, Correlation calculations, Statistical analysis routines

L-1. Primitive Layer (Base Level)

The foundational layer of basic numerical operations and elementary computations that support all higher-level functions.

Examples: Vector/Matrix operations, Numerical integration, Basic mathematical calculations
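
To make the four layers concrete, here is a minimal Python sketch in which each layer is built only from the one below it. All function names are hypothetical illustrations chosen for this example; none of them are BoCoFlow APIs.

```python
import math

# L-1 Primitive layer: basic numerical operations.
def mean(values):
    return sum(values) / len(values)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# L-2 Assembly layer: a reusable routine composed from primitives.
def pearson_correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    return dot(dx, dy) / math.sqrt(dot(dx, dx) * dot(dy, dy))

# L-3 Application layer: self-contained processes, i.e. "nodes".
# Each node takes a dict of inputs and returns a dict of outputs.
def generate_data_node(inputs):
    x = [float(i) for i in range(inputs["n"])]
    y = [2.0 * v + 1.0 for v in x]
    return {"x": x, "y": y}

def analysis_node(inputs):
    return {"correlation": pearson_correlation(inputs["x"], inputs["y"])}

# L-4 Orchestration layer: a workflow that chains nodes together.
def run_workflow(n):
    data = generate_data_node({"n": n})
    return analysis_node(data)

result = run_workflow(5)
```

In this sketch the workflow author only touches `run_workflow` and the node boundaries; the primitive and assembly layers stay hidden inside the nodes, which is the level of abstraction described above.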

Node Abstraction Concept

Dependency Delegation

The node layer bridges workflow design and execution. To keep nodes conceptually clean at design time, we delegate execution-time dependencies (the hardware and software environment a node runs in) to the user. Rather than forcing users into a rigid environment, BoCoFlow lets them configure their own computational setup through executable paths, conda environments, or Docker images, which keeps the setup flexible while maintaining reproducibility.
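
One way to picture this delegation is a small environment descriptor that turns a node's command into the command line for the user's chosen setup. The sketch below is an assumption made for illustration: the `NodeEnvironment` class, its fields, and the `simulate` command are hypothetical and are not BoCoFlow's actual configuration format. It only builds the command; it does not run it.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NodeEnvironment:
    """Hypothetical description of where a node's command should run."""
    executable: Optional[str] = None    # explicit path to the program
    conda_env: Optional[str] = None     # name of an existing conda environment
    docker_image: Optional[str] = None  # name of a Docker image

    def wrap(self, command: List[str]) -> List[str]:
        """Return the command line to launch `command` in this environment.

        `command` is the full command: program name followed by its arguments.
        """
        chosen = [v for v in (self.executable, self.conda_env, self.docker_image) if v]
        if len(chosen) != 1:
            raise ValueError("set exactly one of executable, conda_env, docker_image")
        if self.executable:
            # Replace the program name with the user-supplied path.
            return [self.executable, *command[1:]]
        if self.conda_env:
            return ["conda", "run", "-n", self.conda_env, *command]
        return ["docker", "run", "--rm", self.docker_image, *command]

# Hypothetical node command; the node definition stays the same in every setup.
command = ["simulate", "--input", "system.in"]
conda_cmd = NodeEnvironment(conda_env="md-env").wrap(command)
docker_cmd = NodeEnvironment(docker_image="lab/md:1.0").wrap(command)
```

The node itself only declares `command`; which of the three mechanisms is used is decided by whoever runs the workflow.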

Integrated Environment

A pre-configured environment optimized for learning and exploration

Key Benefits:
  • Zero configuration required
  • Verified, tested environment
  • Perfect for learning and prototyping
  • Instant setup and deployment

Learning-Focused

Ideal for tutorials, workshops, and rapid prototyping where setup time should be minimal

Ready-to-Use Stack

Curated selection of tools and libraries, pre-configured for immediate use

Custom Environment

Full control over your computational environment for production use

Key Benefits:
  • Complete environment control
  • Seamless integration with existing tools
  • Optimized for your infrastructure
  • Enhanced security and compliance

Full Configuration

Define your environment using Conda, Docker, or custom configurations

Team Integration

Share and version control environments across your organization

Our Recommendation

While BoCoFlow provides both options, we recommend the Custom Environment approach for production use. The Integrated Environment is perfect for learning and prototyping, but production workflows benefit from the control and optimization possible with custom environments. We provide extensive documentation and templates for both Conda and Docker to help you establish the perfect environment for your needs.

Workflowability

When computational tasks are wrapped as nodes, they gain powerful workflow capabilities that transform them from isolated operations into interconnected, manageable research components. These capabilities emerge naturally from the node architecture, enabling seamless integration into scientific workflows.

Design and Construction

Drag-and-drop Composability

Visually construct workflows by dragging and connecting nodes on a canvas

Example: Combine molecular dynamics simulation with data analysis nodes by simple drag-and-drop

Smart Configurability

Configure node parameters through an intuitive interface with real-time validation

Example: Adjust simulation parameters with type checking and dependency validation
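
As a rough sketch of what type checking and dependency validation of node parameters could look like, consider the following. The `ParamSpec` class, the `validate` function, and the parameter names are hypothetical and chosen for this example; they do not describe BoCoFlow's actual validation logic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

@dataclass
class ParamSpec:
    """Hypothetical declaration of one node parameter."""
    name: str
    type_: type
    check: Optional[Callable[[Any], bool]] = None  # optional range check
    requires: Optional[str] = None                 # parameter that must also be set

def validate(specs: List[ParamSpec], values: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    for spec in specs:
        if spec.name not in values:
            continue
        value = values[spec.name]
        if not isinstance(value, spec.type_):
            errors.append(
                f"{spec.name}: expected {spec.type_.__name__}, got {type(value).__name__}"
            )
            continue
        if spec.check and not spec.check(value):
            errors.append(f"{spec.name}: value {value!r} is out of range")
        if spec.requires and spec.requires not in values:
            errors.append(f"{spec.name}: requires {spec.requires} to be set")
    return errors

# Hypothetical simulation parameters.
specs = [
    ParamSpec("timestep", float, check=lambda v: v > 0),
    ParamSpec("n_steps", int, check=lambda v: v > 0),
    ParamSpec("thermostat_tau", float, requires="temperature"),
]
errors = validate(specs, {"timestep": "0.002", "n_steps": 1000, "thermostat_tau": 0.1})
```

Here `errors` reports that `timestep` was given as a string and that `thermostat_tau` needs `temperature`, which is the kind of feedback an interface can show before the node ever runs.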

Execution Control

Active Monitoring

Track execution progress, resource usage, and intermediate results in real-time

Example: Monitor convergence metrics during optimization runs

Execution Analytics

Comprehensive debugging suite with execution history, state inspection, and error tracing capabilities

Example: Analyze logs and trace errors in quantum chemistry calculations

Flexible Execution

Skip, rerun, or modify specific nodes without disrupting the entire workflow

Example: Rerun data analysis while keeping simulation results
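
The idea of rerunning one node while keeping upstream results can be sketched with a small result cache. This is a simplified illustration under stated assumptions: `CachedWorkflow` is a hypothetical class, nodes must be added in dependency order, and results are held in memory rather than on disk.

```python
class CachedWorkflow:
    """Hypothetical runner that keeps node results and reruns only what is stale."""

    def __init__(self):
        self.nodes = {}       # name -> (function, list of dependency names)
        self.order = []       # insertion order, assumed to respect dependencies
        self.cache = {}       # name -> stored result
        self.run_counts = {}  # name -> number of executions

    def add(self, name, func, deps=()):
        self.nodes[name] = (func, list(deps))
        self.order.append(name)

    def invalidate(self, name):
        """Drop a node's result and the results of everything downstream of it."""
        self.cache.pop(name, None)
        for other, (_, deps) in self.nodes.items():
            if name in deps:
                self.invalidate(other)

    def run(self):
        for name in self.order:
            if name in self.cache:
                continue  # result still valid: skip this node
            func, deps = self.nodes[name]
            self.cache[name] = func(*[self.cache[d] for d in deps])
            self.run_counts[name] = self.run_counts.get(name, 0) + 1
        return self.cache

wf = CachedWorkflow()
wf.add("simulate", lambda: [1.0, 2.0, 3.0])  # stand-in for an expensive simulation
wf.add("analyze", lambda traj: sum(traj) / len(traj), deps=["simulate"])

wf.run()                   # runs both nodes
wf.invalidate("analyze")   # request a rerun of the analysis only
results = wf.run()         # the simulation result is reused
```

After the second `run()`, `wf.run_counts` shows one execution of `simulate` and two of `analyze`.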

Computational Traceability

Comprehensive Recording

Record every detail of your computation - from input parameters to execution environment

Example: For a machine learning workflow, automatically record: training data versions, model hyperparameters, GPU/CPU configurations, Python package versions, training timestamps and duration
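
A minimal sketch of this kind of record, using only the Python standard library, might look like the following. The record layout and the helper names `capture_record` and `run_with_record` are assumptions for illustration, not BoCoFlow's actual recording format; hardware details such as GPU configuration are left out because they need platform-specific tools.

```python
import json
import platform
import time
from importlib import metadata

def capture_record(parameters, packages):
    """Collect input parameters and environment details into a plain dict."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None  # package is not installed in this environment
    return {
        "parameters": parameters,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "package_versions": versions,
        "started_at": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
    }

def run_with_record(func, parameters, packages=()):
    """Run `func(**parameters)` and return its result plus an execution record."""
    record = capture_record(parameters, packages)
    start = time.perf_counter()
    result = func(**parameters)
    record["duration_seconds"] = time.perf_counter() - start
    return result, record

# Hypothetical training step standing in for a real computation.
def train(learning_rate, epochs):
    return learning_rate * epochs

result, record = run_with_record(
    train, {"learning_rate": 0.5, "epochs": 4}, packages=["numpy"]
)
record_json = json.dumps(record, indent=2)  # ready to store next to the results
```

Because the record is a plain dictionary, it serializes directly to JSON and can be stored alongside the results it describes.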

Enhanced Reproducibility

Complete workflow reproducibility through simple file sharing

Share your entire project with just three components: (1) workflow as a JSON file, (2) custom node implementations as Python files, and (3) execution data and results.
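
To illustrate why a JSON workflow file is easy to share, the sketch below writes a small workflow description to JSON and reads it back with a basic consistency check. The schema shown here (`nodes`, `edges`, `params`, and the node type names) is a hypothetical example and should not be read as BoCoFlow's actual file format.

```python
import json

# Hypothetical workflow description: two nodes connected by one edge.
workflow = {
    "name": "md-analysis",
    "nodes": [
        {"id": "simulate", "type": "MolecularDynamics", "params": {"n_steps": 1000}},
        {"id": "analyze", "type": "DataAnalysis", "params": {"metric": "rmsd"}},
    ],
    "edges": [{"source": "simulate", "target": "analyze"}],
}

def dump_workflow(wf):
    """Serialize with sorted keys so the file diffs cleanly under version control."""
    return json.dumps(wf, indent=2, sort_keys=True)

def load_workflow(text):
    """Parse a workflow and check that every edge refers to a known node."""
    wf = json.loads(text)
    node_ids = {node["id"] for node in wf["nodes"]}
    for edge in wf["edges"]:
        for end in ("source", "target"):
            if edge[end] not in node_ids:
                raise ValueError(f"edge refers to unknown node {edge[end]!r}")
    return wf

text = dump_workflow(workflow)
restored = load_workflow(text)
```

A recipient who has the same custom node implementations can load the file and obtain the same graph and parameters as the sender.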

Impact

This transformation from isolated computations to workflow-enabled nodes revolutionizes how research is conducted. By handling operational complexity, the system frees researchers to focus on scientific discovery rather than technical overhead. Manual, error-prone processes become reliable, reusable components that can be shared with the scientific community - accelerating research progress and fostering open science collaboration through reproducible workflows.

Workflow Features

Technical Essentials

A quick peek under the hood - technical details for the curious

Ready to get started?

Download BoCoFlow and start building your first workflow in minutes. Join us in shaping the future of computational workflows with BoCoFlow.
