Philosophy Behind BoCoFlow
Any computational task can be wrapped as a drag-and-drop node to compose a debuggable, shareable, and reproducible workflow.
Pragmatic and Essential
Consider the day-to-day work of computational scientists. To complete a computational assignment, they might run Python code with complex conda environment dependencies, process files using various installed software packages, and submit heavy computational tasks to a remote supercomputing center. Some document these steps in a file for sharing or future reference. However, this back-and-forth process of trial and error is tedious and error-prone. Some attempt to write a Python script to automate the entire procedure, but the number of scenarios the script has to handle often spirals out of control. Sometimes a simple automation works adequately, but when a new assignment arises, the script has to be rewritten from scratch. Debugging? That's a real challenge. Different software versions, environment conflicts, remote job failures... It's frustrating.
That's why we built BoCoFlow. It's the practical tool we needed but never had – a solution that understands how computational scientists actually work.
Nodes at the Conceptual Level
While any computation can be wrapped as a node, the key is finding the right level of abstraction. We treat nodes at a conceptual level that aligns with how computational scientists naturally think about their work, avoiding unnecessary complexity while maintaining flexibility. Moving nodes to less abstract levels would compromise both performance and conceptual clarity.
We organize computational tasks into four distinct layers, listed below from the highest (L-4) to the lowest (L-1). Each layer builds on the one beneath it to enable increasingly complex scientific workflows.
L-4. Orchestration Layer (Workflow Level)
The highest level where workflows orchestrate multiple applications and processes to achieve complex scientific objectives. This is where researchers design end-to-end pipelines.
Examples: Drug discovery pipelines, Material optimization workflows, Multi-stage data processing
L-3. Application Layer (Node Level)
Complete software packages and applications that serve as the building blocks of workflows. Each node represents a self-contained computational process.
Examples: Molecular dynamics simulations, Quantum chemistry calculations, Data analysis applications
L-2. Assembly Layer (Component Level)
Reusable functions and operations that combine primitive operations into meaningful computational snippets.
Examples: Force field calculations, Correlation calculations, Statistical analysis routines
L-1. Primitive Layer (Base Level)
The foundational layer of basic numerical operations and elementary computations that support all higher-level functions.
Examples: Vector/Matrix operations, Numerical integration, Basic mathematical calculations
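As a rough illustration of how the four layers stack, here is a minimal Python sketch. It is not BoCoFlow's API: every name in it (`dot`, `cosine_similarity`, `similarity_node`, `ranking_node`, `run_workflow`) is hypothetical, and the "workflow" is just a linear chain of functions that pass a dictionary along.

```python
import math
from typing import Callable, Dict, List

# L-1 Primitive layer: elementary numerical operations.
def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def norm(a: List[float]) -> float:
    return math.sqrt(dot(a, a))

# L-2 Assembly layer: a reusable routine built from primitives.
def cosine_similarity(a: List[float], b: List[float]) -> float:
    return dot(a, b) / (norm(a) * norm(b))

# L-3 Application layer: self-contained nodes with named inputs and outputs.
# (Hypothetical node signature: dict in, dict out.)
def similarity_node(inputs: Dict) -> Dict:
    reference = inputs["reference"]
    return {"scores": [cosine_similarity(v, reference) for v in inputs["vectors"]]}

def ranking_node(inputs: Dict) -> Dict:
    scores = inputs["scores"]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {"ranking": order}

# L-4 Orchestration layer: chain nodes so each sees the outputs of earlier ones.
def run_workflow(nodes: List[Callable[[Dict], Dict]], initial: Dict) -> Dict:
    data = dict(initial)
    for node in nodes:
        data.update(node(data))
    return data

result = run_workflow(
    [similarity_node, ranking_node],
    {"vectors": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], "reference": [1.0, 0.0]},
)
print(result["ranking"])
```

In this sketch the researcher only touches the last two layers: choosing nodes and chaining them. The primitive and assembly layers stay hidden inside each node, which is the level of abstraction described above.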
Dependency Delegation
The node layer bridges workflow design and execution. To preserve its conceptual integrity at design time, we delegate the dependencies of the execution environment to the user. Rather than forcing users into a rigid environment, BoCoFlow lets them configure their own computational setup through executable paths, conda environments, or Docker images, which keeps the setup flexible while maintaining reproducibility.
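One way to picture this delegation is a small resolver that turns a user-supplied environment setting into the command that actually runs. The sketch below is an assumption for illustration only: `ExecutionEnv` and `build_command` are hypothetical names, not part of BoCoFlow. It relies only on the standard `conda run -n <env>` and `docker run --rm <image>` command forms.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ExecutionEnv:
    """Hypothetical per-node environment setting; exactly one field should be set."""
    executable: Optional[str] = None    # path to a locally installed program
    conda_env: Optional[str] = None     # name of an existing conda environment
    docker_image: Optional[str] = None  # name of a Docker image

def build_command(env: ExecutionEnv, args: List[str]) -> List[str]:
    """Return the argv list that would run `args` in the chosen environment."""
    chosen = [v for v in (env.executable, env.conda_env, env.docker_image) if v]
    if len(chosen) != 1:
        raise ValueError("set exactly one of executable, conda_env, docker_image")
    if env.executable:
        return [env.executable, *args]
    if env.conda_env:
        return ["conda", "run", "-n", env.conda_env, *args]
    return ["docker", "run", "--rm", env.docker_image, *args]

print(build_command(ExecutionEnv(conda_env="md-tools"), ["python", "simulate.py"]))
```

The resulting list could be handed to `subprocess.run`. The node definition itself stays the same whichever of the three options the user picks.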
Integrated Environment
A pre-configured environment optimized for learning and exploration
Key Benefits:
- Zero configuration required
- Verified, tested environment
- Perfect for learning and prototyping
- Instant setup and deployment
Learning-Focused
Ideal for tutorials, workshops, and rapid prototyping where setup time should be minimal
Ready-to-Use Stack
Curated selection of tools and libraries, pre-configured for immediate use
Custom Environment
Full control over your computational environment for production use
Key Benefits:
- Complete environment control
- Seamless integration with existing tools
- Optimized for your infrastructure
- Enhanced security and compliance
Full Configuration
Define your environment using Conda, Docker, or custom configurations
Team Integration
Share and version control environments across your organization
Our Recommendation
While BoCoFlow provides both options, we recommend the Custom Environment approach for production use. The Integrated Environment is perfect for learning and prototyping, but production workflows benefit from the control and optimization possible with custom environments. We provide extensive documentation and templates for both Conda and Docker to help you establish the perfect environment for your needs.
Workflowability
When computational tasks are wrapped as nodes, they gain powerful workflow capabilities that transform them from isolated operations into interconnected, manageable research components. These capabilities emerge naturally from the node architecture, enabling seamless integration into scientific workflows.
Design and Construction
Drag-and-drop Composability
Visually construct workflows by dragging and connecting nodes on a canvas
Example: Combine molecular dynamics simulation with data analysis nodes by simple drag-and-drop
Smart Configurability
Configure node parameters through an intuitive interface with real-time validation
Example: Adjust simulation parameters with type checking and dependency validation
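To make "type checking and dependency validation" concrete, here is a minimal sketch of what such a check could look like. The `ParamSpec` structure, the `validate` function, and the parameter names are all hypothetical and not taken from BoCoFlow.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ParamSpec:
    """Hypothetical description of one node parameter."""
    name: str
    type_: type
    minimum: Optional[float] = None  # inclusive lower bound, if any
    requires: Optional[str] = None   # another parameter that must also be set

def validate(specs: List[ParamSpec], values: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the values are valid."""
    errors: List[str] = []
    for spec in specs:
        if spec.name not in values:
            continue  # unset parameters are not checked in this sketch
        value = values[spec.name]
        # bool is a subclass of int in Python, so reject it for non-bool specs.
        wrong_type = not isinstance(value, spec.type_) or (
            isinstance(value, bool) and spec.type_ is not bool
        )
        if wrong_type:
            errors.append(
                f"{spec.name}: expected {spec.type_.__name__}, got {type(value).__name__}"
            )
            continue
        if spec.minimum is not None and value < spec.minimum:
            errors.append(f"{spec.name}: must be >= {spec.minimum}")
        if spec.requires is not None and spec.requires not in values:
            errors.append(f"{spec.name}: requires '{spec.requires}' to be set")
    return errors

specs = [
    ParamSpec("timestep", float, minimum=0.0),
    ParamSpec("steps", int, minimum=1),
    ParamSpec("thermostat_tau", float, requires="temperature"),
]
for problem in validate(specs, {"timestep": "0.5", "steps": 0, "thermostat_tau": 1.0}):
    print(problem)
```

Running a check like this on every edit is one plausible way to give real-time feedback in a configuration panel before a long simulation is launched.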
Execution Control
Active Monitoring
Track execution progress, resource usage, and intermediate results in real-time
Example: Monitor convergence metrics during optimization runs
Execution Analytics
Comprehensive debugging suite with execution history, state inspection, and error tracing capabilities
Example: Analyze logs and trace errors in quantum chemistry calculations
Flexible Execution
Skip, rerun, or modify specific nodes without disrupting the entire workflow
Example: Rerun data analysis while keeping simulation results
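The rerun example above can be sketched with a simple output cache. This is an illustrative assumption, not BoCoFlow's execution engine: `run`, `simulate`, and `analyze` are hypothetical, the workflow is a linear chain, and the sketch does not invalidate downstream nodes automatically.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Node = Tuple[str, Callable[[dict], dict]]  # (node name, function: inputs -> outputs)

def run(nodes: List[Node], cache: Dict[str, dict],
        rerun: Iterable[str] = ()) -> Tuple[dict, List[str]]:
    """Run nodes in order, reusing cached outputs unless a node is listed in `rerun`."""
    rerun_set = set(rerun)
    data: dict = {}
    executed: List[str] = []
    for name, func in nodes:
        if name in cache and name not in rerun_set:
            outputs = cache[name]      # keep the earlier result
        else:
            outputs = func(data)       # (re)compute and remember
            cache[name] = outputs
            executed.append(name)
        data.update(outputs)
    return data, executed

def simulate(_: dict) -> dict:
    # Stand-in for an expensive simulation.
    return {"energies": [1.0, 2.0, 3.0]}

def analyze(data: dict) -> dict:
    energies = data["energies"]
    return {"mean_energy": sum(energies) / len(energies)}

workflow = [("simulation", simulate), ("analysis", analyze)]
cache: Dict[str, dict] = {}
_, first = run(workflow, cache)                             # runs both nodes
result, second = run(workflow, cache, rerun=["analysis"])   # reruns analysis only
print(first, second, result["mean_energy"])
```

On the second call the simulation output comes straight from the cache, so only the analysis is recomputed.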
Computational Traceability
Comprehensive Recording
Record every detail of your computation - from input parameters to execution environment
Example: For a machine learning workflow, automatically record training data versions, model hyperparameters, GPU/CPU configurations, Python package versions, and training timestamps and duration
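A minimal sketch of this kind of recording, using only the Python standard library, might look like the following. The function names and record fields are hypothetical; how BoCoFlow stores this information is not described here.

```python
import platform
import time
from datetime import datetime, timezone
from importlib import metadata
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

def package_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up installed versions; packages that are not installed map to None."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

def run_with_record(func: Callable[..., Any], params: Dict[str, Any],
                    packages: Iterable[str] = ()) -> Tuple[Any, Dict[str, Any]]:
    """Call func(**params) and return (result, provenance record)."""
    started = datetime.now(timezone.utc)
    t0 = time.perf_counter()
    result = func(**params)
    duration = time.perf_counter() - t0
    record = {
        "parameters": dict(params),
        "started_at": started.isoformat(),
        "duration_seconds": duration,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "packages": package_versions(packages),
    }
    return result, record

def train(learning_rate: float, epochs: int) -> float:
    # Stand-in for a real training job.
    return learning_rate * epochs

result, record = run_with_record(
    train, {"learning_rate": 0.1, "epochs": 10}, packages=["numpy"]
)
print(record["python_version"], record["packages"])
```

A record like this can be serialized next to the results, so a later reader can see which parameters and package versions produced them.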
Enhanced Reproducibility
Complete workflow reproducibility through simple file sharing
Share your entire project with just three components: (1) workflow as a JSON file, (2) custom node implementations as Python files, and (3) execution data and results.
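To illustrate the first component, the sketch below writes a workflow to JSON and reads it back. The schema (`nodes`, `edges`, `params`) is an assumption made for this example and is not BoCoFlow's actual file format.

```python
import json
from typing import Dict, List

# Hypothetical workflow description: two nodes joined by one edge.
workflow = {
    "name": "md-analysis",
    "nodes": [
        {"id": "simulate", "type": "MolecularDynamics", "params": {"steps": 1000}},
        {"id": "analyze", "type": "DataAnalysis", "params": {"metric": "rmsd"}},
    ],
    "edges": [{"source": "simulate", "target": "analyze"}],
}

def dangling_edges(wf: Dict) -> List[Dict]:
    """Return edges that reference a node id missing from the workflow."""
    ids = {node["id"] for node in wf["nodes"]}
    return [e for e in wf["edges"] if e["source"] not in ids or e["target"] not in ids]

# Serialize with sorted keys so the file diffs cleanly under version control.
text = json.dumps(workflow, indent=2, sort_keys=True)
restored = json.loads(text)

print(restored == workflow, dangling_edges(restored))
```

Because the file is plain text, a recipient can inspect it and run a basic sanity check such as `dangling_edges` before executing anything.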
This transformation from isolated computations to workflow-enabled nodes revolutionizes how research is conducted. By handling operational complexity, the system frees researchers to focus on scientific discovery rather than technical overhead. Manual, error-prone processes become reliable, reusable components that can be shared with the scientific community - accelerating research progress and fostering open science collaboration through reproducible workflows.
Technical Essentials
A quick peek under the hood - technical details for the curious
Ready to get started?
Download BoCoFlow and start building your first workflow in minutes. Join us in shaping the future of computational workflows with BoCoFlow.