Logistics
- Seek support from Intel, Amazon, Microsoft, Google, or others?
- Formally plan to make this a student-run organization
- Meeting time: Monday 6-7
- Format: Tutorial/discussion + lightning talks
- Goal is to make this self-sustaining by the end of the academic year
- Organizational models for learning about software
Topics
- At the terminal
- Version control
- Language environments and tooling
- Documentation
- Scientific computing in the cloud
- Cornell resources for scientific computing
- CAC, CIT, RDMSG, and other groups
- (2016-09-26) The Totient cluster
- Local courses and workshops
- Student-run tutorials
- Workflow automation and documentation
- Getting computing resources for scientific computing
- Overview: Servers, clusters, clouds, supercomputers
- Cornell-local resources
- Supercomputing resources
- Cloud resources
- Environment virtualization
- Virtual machines
- Docker and containers
- Language level: Python virtualenv and conda
- Reproducibility issues
- Reproducibility best practices
- Special issues with reproducibility and high-performance computing
- Build and configuration
- Packaging and distribution
- Semantic versioning
- Automating distribution
- Packaging of compiled codes
- Packaging in Python, Julia, R
- Software licensing and copyright issues
- Data management
- “Just enough” SQL
- HDF5, NetCDF, and related formats
- XML, YAML, JSON, metadata, and semi-structured data
- Facilities for large-scale working data
- Data set archival and dissemination: figshare, zenodo, Dryad, re3data, eCommons
- Plotting and visualization
- Testing and company
- Linting code
- Code reviews and tools
- Valgrind and company
- Types of tests: unit, integration, regression, etc.
- Test-driven development
- Tooling for test automation
- Continuous integration tools and services
- Tickets and bug databases
- Special issues in testing of numerical codes
- Monitoring and checkpointing
- Assertions and exceptions
- Logging systems
- Automated checkpoints
- Application-level checkpointing
- Tuning
- “Just enough” computer architecture
- Profiling for scripts and compiled codes
- Profile-guided optimization
- Sources for tuned libraries
- Accelerator-aided libraries
- Software modernization resources
- Mixed language programming
- Models: script-driven, embedded scripting, configuration languages, cross-language library calls, code generation
- Interface generators
- Function calls vs inter-process communication
- Build-time issues: portability, standard libraries, linkage
- Fast math frameworks and libraries