New to improve?

Discover how improve transforms data science workflows with complete reproducibility, version control, and regulatory compliance.

🏢

Visit Scinteco

Learn about the company behind improve and explore commercial offerings, support options, and enterprise solutions.

Go to scinteco.com

The Story of improve: Mastering Complexity in Data Science

In the intricate world of data science, particularly within pharmaceutical modeling and simulation, teams struggle with critical challenges that improve was designed to solve:

The Challenges

• The Vanishing Path: How was this result obtained?
• "It Worked on My Machine": Inconsistent environments
• Audit Trail Labyrinth: Regulatory compliance burden
• Knowledge Silos: Valuable work locked away
• Fear of Iteration: Risk of losing successful states

The improve Solution

• Every step captured: Automatic versioning
• Built-in reproducibility: Standardized environments
• Complete lineage: Trace any result to its origin
• Streamlined collaboration: Central source of truth
• Accelerated innovation: Safe exploration

improve is more than just a repository; it's an integrated modeling and simulation infrastructure that empowers teams to perform analysis, modeling, and reporting within a secured, validated, and fully traceable environment.

Understanding improve's Core Concepts

📦 Resources & Versioning

Everything in improve is a Resource: files, folders, workflows, and results. Each resource is automatically versioned, creating an immutable history.

• Files: Datasets, scripts, models, reports
• Links: Stable pointers to specific versions
• Steps: Executable workflow units
• Runs: Records of step executions
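
The idea of an immutable version history with stable pointers can be sketched in a few lines of Python. This is only an illustration of the concept, not improve's actual data model or API: the `Resource` and `Version` classes, their methods, and the file name are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of a resource's content (hypothetical)."""
    number: int
    content: bytes
    digest: str  # SHA-256 of the content, usable for integrity checks


@dataclass
class Resource:
    """Hypothetical resource with an append-only version history."""
    name: str
    versions: list = field(default_factory=list)

    def commit(self, content: bytes) -> Version:
        # Never overwrite: every change appends a new version.
        version = Version(
            number=len(self.versions) + 1,
            content=content,
            digest=hashlib.sha256(content).hexdigest(),
        )
        self.versions.append(version)
        return version

    def get(self, number: int) -> Version:
        # Version numbers start at 1.
        if not 1 <= number <= len(self.versions):
            raise KeyError(f"{self.name} has no version {number}")
        return self.versions[number - 1]

    @property
    def latest(self) -> Version:
        return self.versions[-1]


dataset = Resource("pk_data.csv")
v1 = dataset.commit(b"id,conc\n1,0.5\n")
v2 = dataset.commit(b"id,conc\n1,0.5\n2,0.7\n")

# A "link" in this sketch is simply a pinned version number:
# it keeps resolving to the same content after later commits.
pinned = dataset.get(1)
```

Here `pinned` still resolves to the first upload even though a second version exists, which is the property that makes a stable pointer useful.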

🔄 Workflow Management

improve organizes work into Steps: units that define inputs, processing, and outputs. Steps can be chained, branched, and reused.

• Define inputs and configurations
• Execute in controlled environments
• Automatic output versioning
• Create child steps for iterations
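
As a rough mental model, a Step can be pictured as a named function with declared inputs, and a Run as the record of one execution. The sketch below is an assumption-laden illustration rather than improve's interface: the `Step` and `Run` classes and the `execute` method are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one Step execution."""
    step_name: str
    inputs: dict
    output: object


@dataclass
class Step:
    """Hypothetical unit of work: named processing applied to inputs."""
    name: str
    process: Callable[..., object]

    def execute(self, **inputs) -> Run:
        # Each execution yields a new Run that records what went in
        # and what came out.
        return Run(self.name, dict(inputs), self.process(**inputs))


# Two steps chained: the output of the first Run feeds the second Step.
import_step = Step("import", lambda rows: [r for r in rows if r is not None])
summary_step = Step("summary", lambda values: sum(values) / len(values))

import_run = import_step.execute(rows=[1.0, None, 3.0])
summary_run = summary_step.execute(values=import_run.output)
```

Because each Run keeps its inputs alongside its output, a chain of Runs already carries the information needed to reconstruct how a value was produced.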

🔗 Complete Lineage

Every result can be traced back through its complete history - what data, code, and environment produced it.

• Full input-output relationships
• Execution environment details
• User actions and timestamps
• Regulatory audit trail
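
Tracing a result to its origin amounts to walking a graph of input-output relationships. The following minimal sketch assumes lineage is available as a simple mapping; the `LINEAGE` records, file names, and `trace` function are invented for illustration and do not reflect how improve stores or queries lineage.

```python
# Hypothetical lineage records: each result names the step that produced it
# and the inputs that step consumed.
LINEAGE = {
    "report.pdf": {"step": "reporting", "inputs": ["model_fit.csv", "plots.png"]},
    "plots.png": {"step": "plotting", "inputs": ["model_fit.csv"]},
    "model_fit.csv": {"step": "model_fit", "inputs": ["clean_data.csv"]},
    "clean_data.csv": {"step": "qc", "inputs": ["raw_data.csv"]},
    "raw_data.csv": {"step": "import", "inputs": []},
}


def trace(result, lineage):
    """Return every upstream result, nearest first, without duplicates."""
    seen = []
    queue = list(lineage[result]["inputs"])
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(lineage[current]["inputs"])
    return seen


ancestors = trace("report.pdf", LINEAGE)
```

In this toy graph, tracing the report walks back through the plots and model fit to the cleaned data and finally the raw import.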

👥 Team Collaboration

A centralized repository ensures everyone works with the same data and can build upon each other's work.

• Shared workspace for teams
• Formal review processes
• Commenting and annotations
• Access control and permissions

Real-World Example: Pharmaceutical Modeling Workflow

📊

1. Data Import & QC

Clinical data imported, quality checked, and prepared for analysis

🧮

2. Model Development

Base model created, refined through iterations, all versions tracked

📈

3. Results & Reporting

Visualizations created, reports generated with full lineage

How improve Handles This Workflow

Data Import

A Step is created with an import script. When executed, it creates a Run that stores the imported data as a versioned File. The original data source and import parameters are permanently linked to that File.

The "Data Order" web interface also lets you search your study store and create a data specification. improve then uses this specification to import the data and create the versioned File.

Quality Control

A new Step links to the imported data (ensuring the original isn't modified). The QC script and results are added to a formal Review for team approval.

Based on the data specification, you can provide pre-reviewed QC scripts that run automatically after the data import Step. This ensures that data is always checked the same way, and because the QC scripts are already reviewed, it speeds up the process.

improve also helps standardize your QC scripts by tracking all copies of each script and their changes, making it easy to find every execution and adaptation.

Model Refinement

A child step is like a branch in Git: it inherits everything except the results from its parent step and can then be modified. This makes it well suited to model iterations.

Descriptions and rationales can be provided for each model iteration, preserving the complete decision tree. Each refinement maintains links to its parent, creating an "analysis tree" that documents the entire modeling journey.
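
The branching behaviour described above can be illustrated with a small sketch. It assumes a step is characterised by a script, a configuration, a rationale, and its results; the `Step` class, `create_child`, `ancestry`, and all example values are hypothetical and not taken from improve itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    """Hypothetical modeling step that remembers its parent."""
    name: str
    script: str
    config: dict
    rationale: str = ""
    parent: Optional["Step"] = None
    results: dict = field(default_factory=dict)

    def create_child(self, name, rationale, **config_changes):
        # The child inherits script and configuration (with overrides)
        # but starts with empty results, like a fresh branch.
        return Step(
            name=name,
            script=self.script,
            config={**self.config, **config_changes},
            rationale=rationale,
            parent=self,
        )

    def ancestry(self):
        """Names from the root step down to this step."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


base = Step("base_model", "fit.R", {"compartments": 1})
base.results["converged"] = True

child = base.create_child(
    "two_compartment",
    rationale="Residual plots suggest a second compartment",
    compartments=2,
)
```

The parent's configuration and results stay untouched, while `child.ancestry()` recovers the path through the analysis tree.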

Reporting

Reports in improve are usually written with Quarto, R Markdown, or LaTeX. Every result is generated in a Step, ensuring full traceability.

Reports include Links to visualizations, data, and results. improve can trace every element back to its origin, providing complete transparency for regulatory submissions from data import to reviewed report.

Key Capabilities for Your Team

🔄

Reproducibility

• Standardized execution environments
• Captured tool versions and dependencies
• Exact re-execution capability
• Environment reproducibility

🔧

Tool Integration

• NONMEM, Monolix, SAS support
• R and Python ecosystems
• HPC cluster integration
• RESTful API for automation

✅

Regulatory Ready

• 21 CFR Part 11 compliance
• Complete audit trails
• Access control and permissions
• Data integrity assurance

👥

Collaboration

• Centralized repository
• Formal review processes
• Resource commenting
• Workflow sharing

⚡

Scalability

• Local to HPC execution
• Parallel processing support
• Automated output parsing
• Real-time monitoring

🔒

Security

• Granular access control
• Secure authentication (OAuth 2.0)
• Data integrity verification
• High availability options

Ready to Get Started?

improve seamlessly integrates with the tools your team already knows and trusts, scaling from individual workstations to high-performance computing clusters.
