r/ScientificComputing • u/johlars • 5d ago
r/ScientificComputing • u/amberdrake • 6d ago
Mushku.com - secret search, secretly
Howdy,
The issue I had: searching data I had limited access to.
Resolution: client-side Ionizer encoder + SaaS Gravitas search engine
Ionizer is another implementation of OpenEncoder, a patent-pending OSS repo.
Ionizer encodes your data on your machine and creates a single envelope, as specified in the patent and the OSS repo (any encoder that follows the specification is allowed).
This envelope is a single field tensor for each corpus and query.
Gravitas is the zero-knowledge verified oblivious oracle: a blind answer machine.
There is no data egress, and SOX/HIPAA etc. are not triggered because your data never leaves your control. Only a description leaves your machine, in a single field tensor that is easily under 256kb. You send two of those, one for the corpus and one for the query, and Gravitas returns an answer field that you decode and that maps back to what you asked.
By default, both Ionizer and Gravitas attach a verifiable zk-SNARK (Groth16) proof to every output.
Please let me know your thoughts!
r/ScientificComputing • u/PeterBrobby • 12d ago
Introducing Integration Methods
The video explores:
• Numerical integration
• Taylor series truncation error
• Local vs global error
• Forward Euler, Backward Euler and Symplectic Euler
• Stability and energy drift
• Why symplectic methods are favoured in physics engines
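As a rough illustration of the energy-drift point in the list above, here is a minimal sketch (not taken from the video) that applies the three Euler variants to a unit-mass harmonic oscillator. The test problem, the step size and the function names are assumptions chosen for illustration only.

```python
def forward_euler(x, v, dt, omega=1.0):
    """Explicit Euler: both updates use the old state."""
    a = -omega**2 * x
    return x + dt * v, v + dt * a


def symplectic_euler(x, v, dt, omega=1.0):
    """Semi-implicit Euler: update v first, then advance x with the new v."""
    v_new = v + dt * (-omega**2 * x)
    return x + dt * v_new, v_new


def backward_euler(x, v, dt, omega=1.0):
    """Implicit Euler, solved in closed form for this linear problem."""
    # From x1 = x + dt*v1 and v1 = v - dt*omega^2*x1.
    v_new = (v - dt * omega**2 * x) / (1.0 + (dt * omega) ** 2)
    return x + dt * v_new, v_new


def energy(x, v, omega=1.0):
    return 0.5 * v**2 + 0.5 * (omega * x) ** 2


def final_energy(step, n_steps=10_000, dt=0.01):
    x, v = 1.0, 0.0
    for _ in range(n_steps):
        x, v = step(x, v, dt)
    return energy(x, v)


e0 = energy(1.0, 0.0)
e_forward = final_energy(forward_euler)      # grows every step
e_backward = final_energy(backward_euler)    # decays every step
e_symplectic = final_energy(symplectic_euler)  # stays close to e0
```

For this oscillator, forward Euler multiplies the energy by (1 + dt²ω²) each step and backward Euler divides by the same factor, while symplectic Euler keeps the energy oscillating in a narrow band around its initial value.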
r/ScientificComputing • u/BlusLoopedMirror • 12d ago
Audited 512³ split-step quantum-state simulation on an i7 laptop — evidence packet included
I’m an independent researcher in Cairo working on CPU-first numerical simulation and reproducible solver evidence.
I recently released a bounded solver-evidence paper and SHA-256 locked artifact packet:
Audited Laptop-Scale 512³ Quantum-State Simulation: A REPA-Governed Solver Stack Beyond the Cluster-Only Assumption
DOI: https://zenodo.org/records/20247942
The claim is narrow:
- 512³ internal-state complex split-step simulation using a oneAPI CPU backend on an Intel i7 laptop-class machine
- persisted outputs are 2D amplitude/phase slice planes, not full 512³ volume dumps
- separate Crank–Nicolson Hermitian conservation validation
- separate GMRES/multigrid comparison against a PARDISO direct-solve oracle at calibration scale
- dimension-tagged evidence matrix to prevent merging solver lanes
What I am not claiming:
- not 512³ Crank–Nicolson execution
- not 512³ GMRES/PARDISO parity
- not cluster obsolescence in general
- not proof of any AI/identity theory attached to the broader research program
I’m looking for hostile technical review: numerical issues, memory-accounting mistakes, evidence-boundary problems, reproduction suggestions, or places where the public claim should be narrowed.
Paper/evidence packet:
https://zenodo.org/records/20247942
GitHub:
https://github.com/ChasingBlu/RECP_evidence
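For readers who want a concrete picture of what a split-step update and the memory accounting involve, here is a minimal 1D sketch. It is not the solver from the packet: the Strang splitting, the harmonic potential, the units (ħ = m = 1) and the helper names are all assumptions made for illustration, and the byte counts assume a single full-volume complex array with no work buffers.

```python
import numpy as np


def split_step(psi, potential, dx, dt, n_steps):
    """Strang-split evolution of i dpsi/dt = -0.5 psi'' + V psi (hbar = m = 1)."""
    k = 2.0 * np.pi * np.fft.fftfreq(psi.size, d=dx)
    half_potential = np.exp(-0.5j * potential * dt)  # half step in V
    kinetic = np.exp(-0.5j * k**2 * dt)              # full step in k-space
    for _ in range(n_steps):
        psi = half_potential * psi
        psi = np.fft.ifft(kinetic * np.fft.fft(psi))
        psi = half_potential * psi
    return psi


def norm(psi, dx):
    return float(np.sum(np.abs(psi) ** 2) * dx)


n = 512
x = np.linspace(-10.0, 10.0, n, endpoint=False)
dx = x[1] - x[0]
potential = 0.5 * x**2  # assumed test potential
psi0 = np.exp(-((x - 1.0) ** 2) / 2.0).astype(np.complex128)
psi0 /= np.sqrt(norm(psi0, dx))

psi = split_step(psi0, potential, dx, dt=0.01, n_steps=200)
norm_before, norm_after = norm(psi0, dx), norm(psi, dx)


def state_bytes(n_per_axis, itemsize):
    """Bytes for one n^3 array; itemsize 16 = complex128, 8 = complex64."""
    return n_per_axis**3 * itemsize


bytes_c128 = state_bytes(512, 16)  # 2 GiB per full-volume complex128 array
bytes_c64 = state_bytes(512, 8)    # 1 GiB per full-volume complex64 array
```

Every factor in the update has unit modulus and the FFT is unitary, so the norm should be conserved to rounding error. A reviewer checking the memory claim would multiply the per-array figure by the number of full-volume arrays and FFT work buffers the real solver keeps alive.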
r/ScientificComputing • u/Anxious_Tool • 14d ago
MCP server for the TLA+ model checker tla-rs
r/ScientificComputing • u/EreNN_42 • 21d ago
I built an open-source ML pipeline for lithium-ion cathode screening — looking for feedback
cathode-screening.vercel.app
Hi everyone,
I’ve been working on an open-source machine learning pipeline for lithium-ion battery cathode screening:
https://github.com/ErenAri/CathodeX
The goal is not to replace DFT, but to act as a pre-screening layer before expensive DFT validation. The system predicts energy above hull (E_hull) for candidate cathode materials and classifies them into KEEP / MAYBE / KILL decisions based on uncertainty-aware thresholds.
Current technical direction:
- 5-member MACE-MP-0 fine-tuned ensemble
- CHGNet and CGCNN fallback support
- E_hull prediction for transition metal oxide cathode candidates
- Quantile outputs: q10 / q50 / q90
- Epistemic + aleatoric uncertainty estimation
- Conformal calibration for prediction intervals
- SOAP-LOCO-style validation to test generalization to structurally different materials
- Automated governance checks for ranking, calibration, false-kill rate, KEEP precision, and decision validity
- FastAPI backend + Next.js frontend
- DFT verification workflow direction using Quantum Espresso
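To make the conformal-calibration and KEEP / MAYBE / KILL items above concrete, here is a minimal sketch of split conformal prediction feeding a three-way decision rule. It is not code from the CathodeX repository: the function names, the symmetric interval around a point prediction, and the threshold values (0.05 and 0.10 eV/atom) are hypothetical choices for illustration.

```python
import math

import numpy as np


def conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Split-conformal half-width from absolute residuals on a calibration set."""
    scores = np.sort(np.abs(np.asarray(cal_true) - np.asarray(cal_pred)))
    n = scores.size
    rank = math.ceil((n + 1) * (1.0 - alpha))  # finite-sample corrected rank
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return float(scores[rank - 1])


def decide(pred_ehull, halfwidth, keep_max=0.05, kill_min=0.10):
    """Hypothetical policy: act only when the whole interval agrees."""
    lower, upper = pred_ehull - halfwidth, pred_ehull + halfwidth
    if upper <= keep_max:
        return "KEEP"
    if lower >= kill_min:
        return "KILL"
    return "MAYBE"


# Made-up calibration data (eV/atom), only to exercise the functions.
cal_true = np.array([0.00, 0.02, 0.05, 0.10, 0.20, 0.03, 0.08, 0.15, 0.01, 0.06])
residuals = np.array([0.01, 0.02, 0.005, 0.015, 0.01, 0.02, 0.005, 0.01, 0.015, 0.01])
cal_pred = cal_true + residuals

halfwidth = conformal_halfwidth(cal_true, cal_pred, alpha=0.2)
decisions = [decide(p, halfwidth) for p in (0.01, 0.07, 0.20)]
```

The coverage guarantee of split conformal prediction relies on calibration and test points being exchangeable, which is exactly what a LOCO-style split breaks. That is one reason intervals calibrated in-distribution can look too narrow on structurally different materials.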
The repository currently reports strong in-distribution test metrics, but also clearly shows a major limitation: LOCO generalization is much weaker. I’m trying to make the project honest about where the model is useful and where it should not be trusted without additional validation.
I would especially appreciate feedback on:
Whether the validation methodology is strict enough
Whether the KEEP / MAYBE / KILL policy is scientifically reasonable
Whether the uncertainty and calibration story is convincing
What would make this more useful for actual computational materials researchers
Whether the README communicates the limitations clearly enough
This is not a claim of discovering DFT-verified new cathodes yet. It is an open-source screening and model-governance pipeline intended to reduce the candidate space before deeper simulation or expert review.
Any criticism from materials science, computational chemistry, battery research, or scientific ML people would be very useful.
r/ScientificComputing • u/Pure_Treat6246 • 22d ago
PhysCC: A DSL Compiler for Physics Simulations (SYCL, MPI, AVX2)
I’ve been working on PhysCC, an open-source tool designed to bridge the gap between high-level physics equations and low-level hardware optimization.
The problem: Writing boilerplate for SYCL, MPI, or AVX2 stencils is tedious. The solution: You write a simple equation like u = u + dt * lap(u) and PhysCC generates the optimized backend code.
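For reference, this is roughly what the equation u = u + dt * lap(u) means as a plain, unoptimized stencil. It is a NumPy sketch of the math, not PhysCC output. The 5-point Laplacian, the periodic boundaries and the grid size are assumptions, since the post does not say how PhysCC handles boundaries.

```python
import numpy as np


def lap(u, dx=1.0):
    """5-point Laplacian with periodic boundaries (an assumed boundary choice)."""
    return (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    ) / dx**2


def step(u, dt, dx=1.0):
    """One explicit update: u = u + dt * lap(u)."""
    return u + dt * lap(u, dx)


u = np.zeros((64, 64))
u[32, 32] = 1.0          # point source
total_before = u.sum()

dt = 0.2                 # explicit 2D diffusion needs dt <= dx**2 / 4
for _ in range(100):
    u = step(u, dt)

total_after = u.sum()
```

With periodic boundaries the update conserves the total of u, which gives a cheap sanity check to run against any generated backend.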
Key Features:
- Multi-backend support (Single-core, OpenMP, MPI, SYCL, CUDA).
- AI-informed pass: It analyzes the PDE type (Hyperbolic, Parabolic, Elliptic) and suggests optimal work-group sizes for Intel Iris Xe.
- Built-in visualization script for heatmaps.
It’s still a work in progress, but I’d love to hear your thoughts on the codegen or the feature extraction logic!
https://github.com/NikosPappas/PhysCC
r/ScientificComputing • u/hconel • 23d ago
Two identical MPI jobs slow down drastically on Intel Alder Lake but not on Threadripper. Is it normal?
Hi everyone,
I regularly run multiple parallel MPI jobs simultaneously on my workstations. I have two systems:
- Intel i7-12700 (12 cores: 8 P-cores + 4 E-cores), OS: Ubuntu 20.04
- AMD Threadripper 3960X (24 cores, 48 threads), OS: Ubuntu 18.04
I wrote a simple C++ MPI test program that runs with mpirun -np 2. On both machines, a single instance finishes in about 12 seconds.
The problem appears when I run two instances at the same time (both mpirun -np 2):
- Threadripper: Both finish in ~12 seconds (no slowdown)
- Intel: Both take ~30 seconds (significant slowdown)
I tried pinning processes to specific cores using taskset and --cpu-set in mpirun. The processes do land on the correct cores (I verified with ps), but the slowdown persists.
Is this expected behavior for Alder Lake? Could the hybrid P-core/E-core architecture be causing memory bandwidth contention? Or am I missing something else?
I'm trying to figure out if my Intel system is performing normally or if I should be hunting for a configuration issue.
Additional notes:
- My code shows reasonable, normal speed-up as the core count increases on both systems
- The Intel PC has only one memory stick
- The AMD PC has multiple memory sticks
- My test code is not memory intensive (mostly CPU math)
I can provide more details if needed. I'm not super knowledgeable about CPU architectures, so apologies in advance.
Thanks for any insights!
r/ScientificComputing • u/Entphorse • 27d ago
Geant4-DNA track-structure Monte Carlo running in a browser tab via WebGPU — validated against Karamitros 2011, no install
Geant4-DNA is the CNRS/IN2P3-coordinated Monte Carlo toolkit for track-structure radiobiology—the gold-standard reference for cancer radiotherapy dose calibration and astronaut radiation exposure modeling.
It normally runs as C++ on a CPU and requires a significant machine to set up. I ported the per-electron physics + Karamitros 2011 IRT chemistry to WebGPU. It now runs in any browser tab on a laptop GPU with no installation required.
- Live Demo: webgpudna.com/see (4D viewer, scrubbable in time from t=0 to 1 μs)
- Code (MIT): github.com/abgnydn/webgpu-dna
Validation against Geant4-DNA 11.3.0
(4096 primaries @ 10 keV in liquid water)
| Metric | This Build | Reference | Ratio |
|---|---|---|---|
| CSDA range | 2714.4 nm | 2756.5 nm | 0.985× |
| Energy conservation | 100.0% | 100.0% | 1.000× |
| Ions per primary | ≈509 | 509.1 | 1.00× |
| G(OH) at 1 μs | 1.55 | 2.50 (Karamitros 2011) | 0.62× ^1 |
| G(e⁻aq) at 1 μs | 1.41 | 2.50 | 0.56× ^1 |
| G(H) at 1 μs | 0.71 | 0.57 | 1.24× |
| G(H₂O₂) at 1 μs | 0.60 | 0.73 | 0.83× |
| G(H₂) at 1 μs | 0.47 | 0.42 | 1.11× |
^1 Karamitros reference is at ~1 MeV (low-LET); the runs here are at 10 keV (high-LET), which suppresses G(OH) and G(e⁻aq) due to denser intra-track recombination. Other G-values are within 25%.
How it works
- Physics: One GPU thread per primary electron. The full interaction chain (ionization, excitation, elastic scatter to track end) is fused into a single compute dispatch via WGSL. Secondaries are processed in waves until the population is depleted.
- Chemistry: Karamitros 2011 IRT chemistry runs in a Web Worker, followed by SSB/DSB scoring on a 21×21 B-DNA fiber grid.
- Data: Cross sections are sourced from G4EMLOW 8.8 (shipped with Geant4 11.4.1).
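The "follow each particle to track end, then process secondaries in waves" scheduling described above can be sketched on the CPU as follows. This is a toy model, not the WGSL kernel and not Geant4-DNA physics: the fixed energy fractions, the cutoff and the function name are invented purely to show the control flow and an energy-conservation check.

```python
def run_waves(primary_energies, cutoff=1.0, secondary_share=0.3, local_loss=0.1):
    """Process particles wave by wave until no secondaries remain.

    Toy physics (an assumption): each interaction deposits a fixed fraction
    of the energy locally and hands a fixed fraction to a new secondary.
    """
    deposited = 0.0
    wave = list(primary_energies)
    n_waves = 0
    while wave:
        next_wave = []
        for energy in wave:          # on a GPU: one thread per particle
            while energy > cutoff:   # full chain for this particle
                lost = local_loss * energy
                secondary = secondary_share * energy
                deposited += lost
                next_wave.append(secondary)
                energy -= lost + secondary
            deposited += energy      # track end: deposit the remainder
        wave = next_wave             # secondaries become the next wave
        n_waves += 1
    return deposited, n_waves


primaries = [100.0] * 4
deposited, n_waves = run_waves(primaries)
```

Whatever the real cross sections are, the same invariant applies: the energy deposited across all waves should equal the energy of the primaries, which is what the "Energy conservation" row in the table reports.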
Performance & Benchmarks
The kernel-fusion pattern used here is the same one I benchmarked across 92 devices (Rastrigin, N-body, Monte Carlo Pi, RL environments, transformer decoding). Medians show:
- 71× on Apple Silicon
- 56× on NVIDIA
- 20× on mobile phones
- Peaks: 226× / 402× / 103× respectively.
Detailed benchmarks are live at kernelfusion.dev and gpubench.dev. Headline claims include 720× CUDA-over-PyTorch (T4) and 159× WebGPU-over-PyTorch (M2), confirmed across CUDA, WebGPU, JAX, and Triton.
Why this project?
The radiobiology target came from my brother-in-law (a physicist and researcher). He suggested Geant4-DNA because of its decades of published reference data, allowing a port to be rigorously validated rather than just "demoed."
Migration was assisted by Claude Code. I am a software engineer focused on browser-native scientific computing, not a radiobiologist, and the validation harness is also AI-generated. If anyone wants to review the WGSL or the comparison harness in validation/compare.py, I would greatly value the feedback!
r/ScientificComputing • u/Samosho17 • 27d ago
I built an N-body orbital simulator in Python and I’d like some honest feedback
I’ve been working on a small project to simulate orbital mechanics (multi-body gravity + impulsive maneuvers). It uses numerical integration (solve_ivp - RK8) and supports things like transfers and custom Δv inputs.
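For comparison, here is one common way to structure the right-hand side of an N-body problem and an impulsive maneuver. It is a minimal sketch, not code from the repo: the flat state layout, the function names and the two-body test values are assumptions.

```python
import numpy as np

G = 6.674e-11  # m^3 kg^-1 s^-2


def accelerations(pos, mass, g=G):
    """Pairwise Newtonian accelerations; pos has shape (n, 3), mass (n,)."""
    pos = np.asarray(pos, dtype=float)
    diff = pos[None, :, :] - pos[:, None, :]  # diff[i, j] = r_j - r_i
    dist2 = np.sum(diff**2, axis=-1)
    np.fill_diagonal(dist2, 1.0)              # avoid 0/0 on self-pairs
    inv_d3 = dist2**-1.5
    np.fill_diagonal(inv_d3, 0.0)             # a body does not pull on itself
    return g * np.sum(diff * (mass[None, :, None] * inv_d3[:, :, None]), axis=1)


def rhs(t, y, mass):
    """State y = [positions, velocities] flattened, as an ODE solver expects."""
    n = mass.size
    pos = y[: 3 * n].reshape(n, 3)
    vel = y[3 * n:]
    return np.concatenate([vel, accelerations(pos, mass).ravel()])


def apply_impulse(y, body, delta_v, n_bodies):
    """Impulsive maneuver: add delta_v to one body's velocity between solves."""
    y = y.copy()
    start = 3 * n_bodies + 3 * body
    y[start:start + 3] += delta_v
    return y


mass = np.array([5.972e24, 1000.0])              # Earth-like body, satellite
pos = np.array([[0.0, 0.0, 0.0], [7.0e6, 0.0, 0.0]])
acc = accelerations(pos, mass)
net_force = (mass[:, None] * acc).sum(axis=0)    # should vanish (Newton's 3rd law)
```

A function with this signature can be passed to `scipy.integrate.solve_ivp(rhs, t_span, y0, method="DOP853", args=(mass,))`. DOP853 is SciPy's 8th-order Runge-Kutta option, which is presumably what "RK8" refers to. Splitting the integration at each burn and calling `apply_impulse` in between keeps the maneuver out of the ODE itself.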
Here’s the repo:
https://github.com/Samsaj04/N-Body-Orbital-Simulator.git
And here’s a short GIF of the simulation:

What I’d really like feedback on is whether my physics implementation is structured correctly for an N-body setup, and what I should do to improve performance or expand the program further.
Thanks.
r/ScientificComputing • u/Opt4Deck • 27d ago
Parameter estimation with Adjoint: why does it converge so fast?
r/ScientificComputing • u/Dependent-Mud-6146 • May 03 '26
Workstation build for CPU-heavy scientific computing: $6800 grant, 128–256 GB RAM target
r/ScientificComputing • u/Active_Television_10 • May 03 '26
Stability vs. Divergence: A Computational Study of Parameter Space for Nonlinear Root-Finding
r/ScientificComputing • u/TheMakpu • Apr 29 '26
Heat2D: a C++ heat equation solver in 2D
Hi all!
I recently built a small C++ project to solve the generalized heat equation in 2D rectangular domains, mainly as a way to better understand PDE solver design and to experiment with separating spatial discretization from time integration.
The architecture of the project is modular:
- Spatial discretization (currently finite differences with Eigen sparse matrices)
- Time integration (Explicit/Implicit Euler, Crank–Nicolson)
The idea is to make it easy to plug different methods in each part (I’m planning to add finite elements soon). There are also a few example simulations (moving heat source, colliding pulses, etc.) with GIFs in the README.
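The separation described above can be sketched in a few lines: one function builds the spatial operator, and a single theta-scheme step covers all three time integrators. This is a 1D dense-matrix illustration in Python, not the Heat2D C++ code. The Dirichlet boundaries, the function names and the test problem are assumptions.

```python
import numpy as np


def laplacian_1d(n, dx):
    """Spatial part: second-difference matrix, homogeneous Dirichlet boundaries."""
    off = np.ones(n - 1)
    return (np.diag(-2.0 * np.ones(n)) + np.diag(off, 1) + np.diag(off, -1)) / dx**2


def theta_step(u, operator, dt, theta):
    """Time part: theta = 0 explicit Euler, 1 implicit Euler, 0.5 Crank-Nicolson."""
    identity = np.eye(operator.shape[0])
    rhs = (identity + (1.0 - theta) * dt * operator) @ u
    return np.linalg.solve(identity - theta * dt * operator, rhs)


n, dx = 49, 1.0 / 50.0
x = np.linspace(dx, 1.0 - dx, n)   # interior points of (0, 1)
operator = laplacian_1d(n, dx)

dt, n_steps = 1e-4, 1000           # dt < dx**2 / 2 keeps explicit Euler stable
exact = np.exp(-np.pi**2 * dt * n_steps) * np.sin(np.pi * x)

errors = {}
for theta in (0.0, 0.5, 1.0):
    u = np.sin(np.pi * x)
    for _ in range(n_steps):
        u = theta_step(u, operator, dt, theta)
    errors[theta] = float(np.max(np.abs(u - exact)))
```

Because the integrator only sees a matrix, swapping finite differences for finite elements would change `laplacian_1d` (plus a mass matrix) without touching `theta_step`. A production code would factor the left-hand matrix once instead of solving from scratch every step.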
I’d really appreciate some feedback, especially on:
- The overall design/architecture
- Performance considerations
- Whether this approach makes sense vs more template-heavy designs
r/ScientificComputing • u/rxptutoring • Apr 29 '26
reionemu v0.2.0 - Modular PyTorch emulator for kinetic SZ power spectrum from reionization simulations
Hi [r/ScientificComputing](r/ScientificComputing),
I just released **reionemu**, a Python package for building fast neural network emulators of the kinetic Sunyaev-Zel'dovich (kSZ) angular power spectrum using outputs from 2LPT reionization simulations.
It includes a clean pipeline:
- Simulation I/O and flat-sky power spectrum computation
- Data loading + normalization (HDF5)
- PyTorch models with optional MC-dropout uncertainty
- Hyperparameter tuning with Ray Tune
- Reproducibility-focused experiment artifacts
GitHub: https://github.com/RobertxPearce/reionization-emulator
Docs: https://robertxpearce.github.io/reionization-emulator/
Would appreciate feedback from anyone working on scientific ML, surrogate modeling, or high-performance scientific Python tools.
Questions welcome!
r/ScientificComputing • u/galoo123 • Apr 20 '26
WfmOxide - a zero-copy parser for proprietary oscilloscope binary files
hello everyone, for the longest time i have been using python parsers to get data into numpy from binary files in my lab. while they work, the execution latency started getting on my nerves as our datasets grew. waiting for the interpreter to comb through hundreds of deep-memory binary files was just taking too long. as one does when they hit a wall with python, i started looking into faster alternatives. naturally, rust was at the top of my list. i wanted to see if i could build a backend that made the parsing process feel instant, so i started working on this little project. i’ve been using it around the lab and with a few friends for a while now. it turned out significantly faster than i expected, so i decided to generalize it and put it on github for anyone else stuck.
to make it work, i used memmap2 to map binary files directly into virtual memory to avoid those standard ram spikes and the overhead of loading raw payloads. by releasing the python gil and utilizing rayon, the parser can de-interleave adc bytes across every available cpu core simultaneously. the rust core writes data directly into a contiguous memory buffer that is handed to the python runtime as a float32 numpy array without any secondary copying.
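for anyone who wants to see the memory-mapping and de-interleaving idea without reading rust, here is a rough numpy equivalent. it is not the wfmoxide implementation and not a real .wfm layout: the 64-byte header, the two-channel byte interleaving and the scale/offset values are made up for the demo.

```python
import os
import tempfile

import numpy as np

HEADER_BYTES = 64  # hypothetical header size, not a real .wfm layout
N_CHANNELS = 2     # hypothetical: samples stored as ch0, ch1, ch0, ch1, ...


def read_channels(path, scale=0.01, offset=-1.28):
    """Map the payload and return one float32 array per channel."""
    raw = np.memmap(path, dtype=np.uint8, mode="r", offset=HEADER_BYTES)
    usable = raw.size - raw.size % N_CHANNELS
    frames = raw[:usable].reshape(-1, N_CHANNELS)  # a view, nothing copied yet
    # The only copy is the uint8 -> float32 conversion of each channel.
    return [
        frames[:, ch].astype(np.float32) * np.float32(scale) + np.float32(offset)
        for ch in range(N_CHANNELS)
    ]


# Build a small fake capture so the sketch runs on its own.
samples = np.arange(8, dtype=np.uint8)
interleaved = np.empty(16, dtype=np.uint8)
interleaved[0::2] = samples
interleaved[1::2] = 255 - samples

fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as handle:
    handle.write(bytes(HEADER_BYTES) + interleaved.tobytes())

ch0, ch1 = read_channels(path)
os.remove(path)
```

the memmap means the os pages data in on demand instead of reading the whole file up front. the rust version goes further by doing the conversion in parallel across cores, which plain numpy does not do here.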
i tested this on my daily driver, a thinkpad t470s (intel i5-6300u), to see what it could do on resource-constrained lab hardware. i was kinda blown away—rust blew my mind. i got sub-millisecond execution on parsing the metadata and for end-to-end extractions of a 12mb rigol capture that took 375.2 ms in pure python, it now finishes in 53.5 ms on my 9-year-old laptop.
it’s been tailored for our specific needs, but i’ve tried my best to make it flexible for others. it currently supports rigol (ds1000z, ds1000e/d, ds2000) and tektronix (wfm#001-003) families. if anybody wants to check it out here is the github: https://github.com/SGavrl/WfmOxide and you can also just pip install wfm-oxide now. feedback is more than welcome, especially if you have different .wfm file versions or suggestions on the pyo3/rust bridge implementation.
r/ScientificComputing • u/OddHoneydew968 • Apr 18 '26
Need some advice
I’m an incoming freshman planning to go into numerical methods / scientific computing, and I’d appreciate some perspective from people actually in the field.
My research interests are numerical methods for PDEs: high-order spatial discretization (like FEM, DG, IGA), time integration (IMEX, GLMs, multirate), and linear solvers (multigrid and preconditioning especially). I'm also focused on applying them to real things like computational mechanics (CFD especially), and contributing to the software side.
I had the option to attend MIT or Stanford, but chose UT Austin mainly for the Oden Institute, early research access, TACC resources, and a full ride I have there. I already have research connections there and am already involved, so I’d be able to get going quickly.
My question is basically: for someone aiming at grad school or research heavy roles in computational science and math, how much does undergrad prestige actually matter? Does being at a place that’s particularly strong in this niche (UT/Oden) outweigh the broader signaling advantage of MIT/Stanford in the long run? I'm having some doubts over the choice I made.
Would really appreciate input from people who've gone through this path. Thank you!
r/ScientificComputing • u/Defiant_Confection15 • Apr 16 '26
No matrix multiplication. No GPU. Formally verified to silicon. One repo.
git clone https://github.com/spektre-labs/creation-os
Cognitive architecture. v25. SystemVerilog targeting SkyWater 130nm. Formally verified with SymbiYosys. XNOR binding replaces softmax — 87,000× fewer ops. Ternary weights, zero float math. Abstains when uncertain instead of hallucinating.
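For context, "XNOR binding" usually refers to the binding operation from binary hyperdimensional computing. The sketch below shows that generic technique in NumPy. Whether the repo uses it in this form is an assumption, and the vector dimension and names are illustrative.

```python
import numpy as np


def xnor_bind(a, b):
    """Bind two binary hypervectors elementwise; no multiplies, no floats."""
    return np.logical_not(np.logical_xor(a, b))


def similarity(a, b):
    """Fraction of matching bits (1.0 = identical, about 0.5 = unrelated)."""
    return float(np.mean(a == b))


rng = np.random.default_rng(0)
dim = 4096
key = rng.integers(0, 2, dim).astype(bool)
value = rng.integers(0, 2, dim).astype(bool)

bound = xnor_bind(key, value)      # looks unrelated to both inputs
recovered = xnor_bind(bound, key)  # XNOR is its own inverse

sim_bound_value = similarity(bound, value)
sim_recovered_value = similarity(recovered, value)
```

Binding with the same key twice returns the original vector exactly, and the bound vector is nearly uncorrelated with its inputs. In hardware, each bit costs one XNOR gate plus a popcount for the similarity.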
r/ScientificComputing • u/Distinct-Contest-876 • Apr 15 '26
Former lab researcher built a browser-based segmentation tool so biologists don't need to touch a terminal
I'm a software engineer, but before that I worked in academic labs and noticed that getting quantitative data out of fluorescence images is way harder than it should be. Tight budgets mean aging hardware, and the hour-long technical setup just to run a segmentation pipeline feels like a lot when you just want clean data.
So I built Phenora: you upload your fluorescence images (.tif and .ome-tif for now), assign channels, run Cellpose or StarDist on a GPU in the cloud, and download a CSV with per-cell measurements: area, diameter, circularity, mean intensity per channel, centroid, border flag, confidence score. Z-stacks get max-intensity projected automatically, and there's per-channel preprocessing if you need it.
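To give a sense of what the projection and a few of the per-cell columns involve once a label mask exists, here is a minimal NumPy sketch. It is not Phenora's code: the function names, the equivalent-circle definition of diameter and the tiny synthetic image are assumptions, and circularity and confidence are left out because they depend on the segmentation backend.

```python
import numpy as np


def max_project(stack):
    """Max-intensity projection of a (z, y, x) stack to a (y, x) image."""
    return stack.max(axis=0)


def measure_cells(labels, image):
    """Per-cell measurements from an integer label mask (0 = background)."""
    height, width = labels.shape
    rows = []
    for cell_id in np.unique(labels):
        if cell_id == 0:
            continue
        ys, xs = np.nonzero(labels == cell_id)
        area = ys.size
        rows.append({
            "label": int(cell_id),
            "area_px": int(area),
            # Diameter of a circle with the same area (an assumed definition).
            "equiv_diameter_px": float(2.0 * np.sqrt(area / np.pi)),
            "mean_intensity": float(image[ys, xs].mean()),
            "centroid_yx": (float(ys.mean()), float(xs.mean())),
            "touches_border": bool(
                ys.min() == 0 or xs.min() == 0
                or ys.max() == height - 1 or xs.max() == width - 1
            ),
        })
    return rows


# Tiny synthetic z-stack and mask, only to exercise the functions.
stack = np.zeros((3, 6, 6))
stack[1, 1:3, 1:3] = 10.0
stack[2, 4:6, 4:6] = 20.0
labels = np.zeros((6, 6), dtype=int)
labels[1:3, 1:3] = 1
labels[4:6, 4:6] = 2

rows = measure_cells(labels, max_project(stack))
```

The border flag matters in practice because cells cut off by the image edge have underestimated areas, so they are often excluded from downstream statistics.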
Curious whether other labs have found better solutions, and what measurements or workflow steps would make this actually useful for how you run imaging experiments in your lab?
r/ScientificComputing • u/victotronics • Apr 10 '26
How does Conjugate Gradients deal with singular systems?
r/ScientificComputing • u/XZark • Apr 06 '26
MCP server that connects Claude/Codex/VS Code to your local Mathematica
r/ScientificComputing • u/avocadosoccer • Apr 05 '26
Can Courant MS Scientific Computing be a gateway to Quant Finance or Big Tech?
Depending on electives, it seems like it could be a good match for either career path/internship. Is that realistic?
r/ScientificComputing • u/AbhyuBoi • Apr 05 '26
Best path into computational science/scientific computing?
Hello all!
I finished my A-Levels last year and am a bit confused about what I should do my bachelor's in.
Would a bachelor's in Physics/Math/CS followed by a master's in scientific computing/computational science be better than doing a computational bachelor's (like Computational and Data Science (KIT) or Computational Engineering Science (RWTH Aachen))?
I'm really interested in math and simulating physics, but I'm really not sure what path to take.
Any advice would be greatly appreciated!
P.S. what's the difference between computational science and scientific computing? Most sites online use them interchangeably so that adds to the confusion.
r/ScientificComputing • u/MekataRupma • Mar 30 '26
Which Linux distro to choose for Computational Physics?
I'm torn between Pop!_OS, Fedora KDE, CachyOS, AlmaLinux, and Ubuntu. My laptop has an Nvidia graphics card plus a CPU with an iGPU, and I want to be able to switch between the iGPU and dGPU for lighter and heavier tasks on Linux. I dual boot with Windows for gaming and fun; Linux is only for work and study. I want decent customisation, compatibility with all the software needed for my research, reasonably recent packages so I don't have to run old versions like with Debian, easy bug fixes, and enough stability that my system doesn't crash on updates all the time.
r/ScientificComputing • u/Georgiou1226 • Mar 27 '26