Website Update Commands
This guide contains all the commands you need to update your clean, optimized academic website.
Note: The codebase has been cleaned up, with 20+ unnecessary template files, unused layouts, demo content, and duplicate files removed. See STRUCTURE.md for full details.
Quick Update (Recommended)
For a complete website refresh with the latest publications and clean organization:
cd /Users/nesar/Projects/Misc/nesar.github.io
python3 scripts/cleanup_and_organize.py
This single command will:
- ✅ Refresh research images by extracting figures from papers
- ✅ Update publications page with single Paper link logic
- ✅ Analyze all 60+ publications and categorize them correctly
- ✅ Ensure exhaustive publication lists for each research area
- ✅ Select diverse figures from different papers (max 2 per category)
- ✅ Create clean portfolio pages without duplicates
- ✅ Generate a clean research overview page
- ✅ Fix any content issues automatically
Individual Update Commands
1. Refresh Research Images Only
If you just want to update the research figures:
# Auto-extract figures from all publications
python3 scripts/auto_extract_from_publications.py
# Extract figures from a specific PDF
python3 scripts/extract_figures.py /path/to/new_paper.pdf
2. Update Publications Data
To fetch the latest publications from Google Scholar/arXiv:
python3 scripts/update_scholar_publications.py
3. Validate Website Structure
To check for issues without making changes:
# Check for duplicate content
grep -r "Machine Learning & AI" _portfolio/ _pages/
# Count publications per category
python3 -c "
import os
from collections import defaultdict
categories = defaultdict(int)
for f in os.listdir('_publications'):
    if f.endswith('.md'):
        with open(f'_publications/{f}', 'r') as file:
            content = file.read().lower()
        # The first matching category wins; files with no match are not counted
        if any(k in content for k in ['machine learning', 'deep learning', 'ai ']):
            categories['ML'] += 1
        elif any(k in content for k in ['dark matter', 'cosmic web', 'cosmology']):
            categories['Dark Matter'] += 1
        elif any(k in content for k in ['uncertainty', 'probabilistic']):
            categories['UQ'] += 1
        elif any(k in content for k in ['emulator', 'surrogate']):
            categories['Emulation'] += 1
for cat, count in categories.items():
    print(f'{cat}: {count} papers')
"
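The counting command assigns each file to the first category whose keywords match, so a paper that mentions both "deep learning" and "cosmology" is counted under ML only. A minimal sketch of the same first-match logic as reusable functions follows. The names `CATEGORY_KEYWORDS`, `categorize`, and `count_categories` are illustrative and are not part of the repository's scripts, and the "Uncategorized" bucket is an addition for visibility.

```python
import os
from collections import Counter

# Keyword lists copied from the counting command; order matters because
# the first matching category wins.
CATEGORY_KEYWORDS = [
    ("ML", ["machine learning", "deep learning", "ai "]),
    ("Dark Matter", ["dark matter", "cosmic web", "cosmology"]),
    ("UQ", ["uncertainty", "probabilistic"]),
    ("Emulation", ["emulator", "surrogate"]),
]


def categorize(text):
    """Return the first category whose keywords appear in text, else None."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return category
    return None


def count_categories(directory="_publications"):
    """Count Markdown files per category; unmatched files are 'Uncategorized'."""
    counts = Counter()
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".md"):
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as handle:
            counts[categorize(handle.read()) or "Uncategorized"] += 1
    return counts
```

Calling `count_categories()` from the repository root returns a `Counter` keyed by category. A large "Uncategorized" count would suggest the keyword lists need extending.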
Testing Locally
Before deploying changes:
# Install Jekyll dependencies (one-time setup)
bundle install
# Serve locally
bundle exec jekyll serve
# View at: http://localhost:4000
Deployment
Your website auto-deploys via GitHub Pages when you push to master:
git add .
git commit -m "Update research content and figures"
git push origin master
File Structure Reference
_portfolio/
├── portfolio-1-machine-learning.md # 24 ML papers
├── portfolio-2-dark-matter.md # 10 Dark Matter papers
├── portfolio-3-uncertainty-quantification.md # 4 UQ papers
└── portfolio-4-statistical-emulation.md # 10 Statistical Emulation papers
_pages/
└── research.html # Clean overview page
images/research/figures/ # Extracted figures
scripts/
├── cleanup_and_organize.py # Main update script
├── extract_figures.py # Figure extraction
└── update_scholar_publications.py # Publication updates
Troubleshooting
If figures aren’t showing:
# Check if figure files exist
ls -la images/research/figures/
# Regenerate figures from all publications
python3 scripts/auto_extract_from_publications.py
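When figures are missing from a page, it can help to see which image files actually exist and whether any are empty. The sketch below lists image files under the figures directory. The function name `list_figures` and the set of extensions are assumptions, since this guide does not state which formats the extraction scripts write.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the extraction scripts write.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def list_figures(directory="images/research/figures"):
    """Return sorted image paths under directory, or [] if it does not exist."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


figures = list_figures()
print(f"{len(figures)} figure file(s) found")
for path in figures:
    print(f"  {path} ({path.stat().st_size} bytes)")
```

An empty result or zero-byte files would suggest that the extraction step did not complete.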
If duplicates appear:
# Run cleanup (fixes all duplicates)
python3 scripts/cleanup_and_organize.py
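To confirm that duplicates are gone after the cleanup, the grep check from the validation section can be reproduced in Python with line numbers. This is a rough sketch; `find_phrase` is a hypothetical helper, not one of the repository's scripts, and it only does plain substring matching.

```python
from pathlib import Path


def find_phrase(phrase, directories=("_portfolio", "_pages")):
    """Return (path, line_number) pairs for every line containing phrase."""
    hits = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # Skip directories that do not exist.
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if phrase in line:
                    hits.append((str(path), number))
    return hits


for path, number in find_phrase("Machine Learning & AI"):
    print(f"{path}:{number}")
```

Several hits for the same heading within one file are the likely sign of duplicated content.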
If publication counts are wrong:
# Force refresh all publications
python3 scripts/cleanup_and_organize.py
Automation Schedule
For regular updates, you can set up a cron job:
# Edit crontab
crontab -e
# Add line for monthly updates (runs 1st of each month at 9 AM)
0 9 1 * * cd /Users/nesar/Projects/Misc/nesar.github.io && python3 scripts/cleanup_and_organize.py
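Cron jobs run without a terminal, so the console output that the Support section relies on is easy to miss. One option is a small wrapper that appends the output and exit code to a log file. The sketch below is an assumption about how that could look; `run_and_log` and the log file name are hypothetical and not part of the repository.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path


def run_and_log(command, log_path):
    """Run command, append its combined output to log_path, return the exit code."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # Merge stderr into stdout so errors are logged too.
        text=True,
    )
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(f"--- {datetime.now().isoformat(timespec='seconds')} ---\n")
        log.write(result.stdout)
        log.write(f"exit code: {result.returncode}\n")
    return result.returncode


# Hypothetical usage from the repository root:
# sys.exit(run_and_log([sys.executable, "scripts/cleanup_and_organize.py"], "update.log"))
```

The cron entry would then call the wrapper instead of the update script directly, and a non-zero exit code in the log marks a failed run.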
Current Status
After running cleanup_and_organize.py, your website has:
- ✅ Foundation Models: LLM and foundation model research, 2 figures
- ✅ Machine Learning for Science: Non-LLM ML applications, 2 figures
- ✅ Dark Matter & Cosmology: Cosmological simulations, 2 figures
- ✅ Emulation & Inference: Combined UQ and emulation methods, 2 figures
- ✅ Clean research overview with no duplicates
- ✅ Single “Paper” links (published version or arXiv fallback)
- ✅ Publications properly categorized by research area
Support
If you encounter issues:
- Check this file for relevant commands
- Run python3 scripts/cleanup_and_organize.py to fix most problems
- Check the console output for specific error messages