Publications by Year (* denotes my advisees)

2024

[SC'24] Hariharan Devarajan, Loic Pottier, Kaushik Velusamy, Huihuo Zheng, Izzet Yildirim, Olga Kogiou, Weikuan Yu, Anthony Kougkas, Xian-He Sun, Jae Seung Yeom, and Kathryn Mohror. DFTracer: An Analysis-Friendly Data Flow Tracer for AI-Driven Workflows. Supercomputing 2024. Atlanta, GA. November 2024.
[GeoInformatica'24] S. Prabakar, H. Chen, Z. Jiang, C. Yang, W. Yu, and D. Yan. LENS: label sparsity-tolerant adversarial learning on spatial deceptive reviews. GeoInformatica, September 14, 2024, doi: 10.1007/s10707-024-00529-5.
[REXIO'24] Olga Kogiou, Hariharan Devarajan, Chen Wang, Weikuan Yu and Kathryn Mohror. Understanding Adaptable Storage for Diverse Workloads. REX-IO Workshop. Held in conjunction with Cluster 2024.

2023

[Poster] Olga Kogiou, Hariharan Devarajan, Chen Wang, Weikuan Yu, Kathryn M. Mohror: I/O characterization and performance evaluation of large-scale storage architectures for heterogeneous workloads. CLUSTER Workshops 2023: 44-45.

2022

[Cluster'22] Ismail Ataie, Weikuan Yu. SVAGC: Garbage Collection with a Scalable Virtual Address Swapping Technique. September 2022. IEEE Cluster. Heidelberg, Germany.
[IPDPS'22] Fahim Tahmid Chowdhury, Francesco Di Natale, Adam Moody, Kathryn Mohror, Weikuan Yu. DFMan: A Graph-based Optimization of Dataflow Scheduling on High-Performance Computing Systems. May 2022. 36th IEEE International Parallel and Distributed Processing Symposium. Lyon, France.
[TPDS'22] Z Li*, B Jiao*, S He, W Yu. PHAST: Hierarchical Concurrent Log-Free Skip List for Persistent Memory. IEEE Transactions on Parallel and Distributed Systems. In Press. * authors with equal contribution.
[BSPC'22] Xingang Fang, Julia Klawohn, Alexander De Sabatino, Harsh Kundnani, Jonathan Ryan, Weikuan Yu, Greg Hajcak. Accurate classification of depression through optimized machine learning models on high-dimensional noisy data. Biomedical Signal Processing and Control, Volume 71, 103237, Elsevier. January 2022.

2021

[Cluster'21] S Bhattacharya, W Yu, FT Chowdhury, K Mohror. O(1) Communication for Distributed SGD through Two-Level Gradient Averaging. 2021 IEEE International Conference on Cluster Computing (CLUSTER), 332-343.
[CCGrid'21] Rupak Roy, Kento Sato, Subhadeep Bhattacharya, Xingang Fang, Yasumasa Joti, Takaki Hatsui, Toshiyuki Nishiyama Hiraki, Jian Guo, Weikuan Yu. Compression of Time Evolutionary Image Data through Predictive Deep Neural Networks. May 2021. 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid). Pages 41-50.
[ICPP'21] Md Muhib Khan, Weikuan Yu. ROBOTune: High-Dimensional Configuration Tuning for Cluster-Based Data Analytics. August 2021. 50th International Conference on Parallel Processing. Pages 1-10.

2020

[arXiv'20] Subhadeep Bhattacharya*, Weikuan Yu, Fahim Tahmid Chowdhury*. O(1) Communication for Distributed SGD through Two-Level Gradient Averaging. June 2020. arXiv preprint.
[HPS'20] T. Dey*, K. Sato, B. Nicolae, J. Guo, J. Domke, W. Yu, F. Cappello and K. Mohror. Optimizing Asynchronous Multi-level Checkpoint/Restart Configurations with Machine Learning. High-Performance Storage workshop, held in conjunction with IPDPS'20. May 2020.
[JCST'20] Andre Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, and Marc-André Vef. Ad hoc File systems for HPC. Journal of Computer Science and Technology. Vol 35-1. Pages 4-26. January 2020.

2019

[Cluster'19] Z. Yue*, W. Yu, B. Jiao*, K. Mohror, A. Moody, and F. Chowdhury*. Efficient User-Level Storage Disaggregation for Deep Learning. International Conference on Cluster Computing. September 2019, Albuquerque, NM.
[ICPP'19] F. Chowdhury*, Y. Zhu*, T. Heer, S. Paredes*, A. Moody, R. Goldstone, K. Mohror, W. Yu. I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning. The 2019 International Conference on Parallel Processing. August 2019. Kyoto, Japan.
[ISMM'19] M. Khan*, Ahad Alam*, A. Nath*, W. Yu. Exploration of Memory Hybridization for RDD Caching in Spark. The 2019 International Symposium on Memory Management. June 2019. Phoenix, AZ.
[ParCo'19]: Z. Liu*, A. Nath*, X. Ding, H. Fu*, M. Khan, W. Yu. Multivariate Modeling and Two-Level Scheduling of Analytic Queries. Journal of Parallel Computing. DOI (https://doi.org/10.1016/j.parco.2019.01.006). February 2019.

2018

[OpenSHMEM'18] Subhadeep Bhattacharya*, Shaeke Salman*, Manjunath Gorentla Venkata, Harsh Kundnani*, Neena Imam and Weikuan Yu. An Initial Implementation of Libfabric Conduit for OpenSHMEM-X. Fifth workshop on OpenSHMEM and Related Technologies (OpenSHMEM'18). Baltimore, Maryland. August 2018.
[ROSS'18] Z. Yue*, T. Wang*, K. Mohror, A. Moody, K. Sato, M. Khan and W. Yu. Direct-FUSE: Removing the Middleman for High-Performance FUSE File System Support. 8th International Workshop on Runtime and Operating Systems for Supercomputers. June 2018. Tempe, AZ.
[MASCOTS'18] Z. Yue*, F. Chowdhury*, H. Fu*, A. Moody, K. Mohror, K. Sato, and W. Yu. Entropy-Aware I/O Pipelining for Large-Scale Deep Learning on HPC Systems. 26th International Conference on Modeling, Analysis and Simulation of Computer and Telecommunication Systems. September 2018, Chicago, IL. (Acceptance rate: 30%).
[CCGrid'18] H. Fu*, M. Gorentla Venkata, Shaeke Salman*, N. Imam, and W. Yu. SHMEMGraph: Efficient and Balanced Graph Processing Using One-sided Communication. 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). Washington, DC. (Acceptance rate: 21%). May 2018.

2017

[JCC'17] Zhuo Liu*, Bin Wang*, and W. Yu. HALO: a fast and durable disk write cache using phase change memory. Journal of Cluster Computing. 2017. In Press. Paper.
[ParCo'17]: L. Shi*, Z. Wang, W. Yu, X. Meng. A Case Study of Tuning MapReduce for Efficient Bioinformatics in the Cloud. Journal of Parallel Computing. Volume 61, January 2017, Pages 83-95.
[ParCo'17]: H. Fu*, H. Chen, Y. Zhu*, W. Yu. FARMS: Efficient MapReduce Speculation for Failure Recovery in Short Jobs. Journal of Parallel Computing. Volume 61, January 2017, Pages 68-82.
[OpenSHMEM'17] H. Fu*, M. Gorentla Venkata, N. Imam, and W. Yu. Portable SHMEMCache: A High-Performance Key-Value store on OpenSHMEM and MPI. Fourth workshop on OpenSHMEM and Related Technologies (OpenSHMEM'17). Annapolis, Maryland. August 2017.
[CCGrid'17] H. Fu*, M. Gorentla Venkata, A. Roy Choudhury*, N. Imam, and W. Yu. High-Performance Key-Value Store On OpenSHMEM. 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). Madrid, Spain. (Acceptance rate: 23%). May 2017.
[IPDPS'17] T. Wang*, A. Moody, Y. Zhu*, K Mohror, K. Sato, T. Islam, and W. Yu. MetaKV: A Key-Value Store for Metadata Management of Distributed Burst Buffers. 31st IEEE International Parallel and Distributed Processing Symposium. Orlando, FL. (Acceptance rate: 22%). May 2017.

2016

[SC'16]: T. Wang*, K. Mohror, A. Moody, K. Sato, W. Yu. An Ephemeral Burst-Buffer File System for Scientific Applications. International Conference for High Performance Computing, Networking, Storage and Analysis. Salt Lake City, Utah. November 2016. (Acceptance rate: 18%).
[PACT'16]: B. Wang*, Y. Zhu*, W. Yu. OAWS: Memory Occlusion Aware Warp Scheduling. International Conference on Parallel Architecture and Compilation Techniques (PACT 2016). September 2016. (Acceptance rate: 26%). Haifa, Israel. Paper.
[TPDS'16]: C. Xu*, R. Goldstone, Z. Liu*, H. Chen*, B. Neitzel, W. Yu. Exploiting Analytics Shipping with Virtualized MapReduce on HPC Backend Storage Servers. IEEE Transactions on Parallel and Distributed Systems. Paper.
[IJHPCA]: Teng Wang*, Kevin Vasko*, Zhuo Liu*, Hui Chen*, Weikuan Yu. Enhance Scientific Application I/O with Cross-Bundle Aggregation. International Journal of High Performance Computing Applications. Paper.

2015

[NAS'15]: Fang Zhou*, Hai Pham*, Jianhui Yue*, Hao Zou* and Weikuan Yu. SFMapReduce: An Optimized MapReduce Framework for Small Files. IEEE International Conference on Network, Architecture and Storage (NAS). August 2015, Boston, MA. Paper.
[MASCOTS'15]: X. Wang*, B. Wang*, Z. Liu*, W. Yu. Preserving Row Buffer Locality for PCM Wear-Leveling Under Massive Parallelism. 23rd International Conference on Modeling, Analysis and Simulation of Computer and Telecommunication Systems. October 2015, Atlanta, GA. Paper.
[Cluster'15]: T. Wang*, H.S. Oral, H. Pritchard*, B. Wang* and W. Yu. TRIO: Burst Buffer Based I/O Orchestration. IEEE International Conference on Cluster Computing. September 2015, Chicago, IL. Paper.
[ICS'15]: B. Wang*, W. Yu, X.H. Sun, X. Wang. DaCache: Memory Divergence-Aware GPU Cache Management. 29th International Conference on Supercomputing, June 2015, Newport Beach, CA. Paper.
[IPDPS'15]: Y. Wang*, H. Fu*, and W. Yu. Cracking Down MapReduce Failure Amplification through Analytics Logging and Migration. 29th IEEE International Parallel and Distributed Processing Symposium (Acceptance rate: 22%). Hyderabad, India. May 2015. Paper.
[DATE'15]: B. Wang*, Z. Liu*, X. Wang*, and W. Yu. Eliminating Intra-Warp Conflict Misses in GPU. The 18th Conference on Design Automation and Test in Europe. (Long paper. Acceptance rate: 22.4%). Grenoble, France. March 2015. Paper.
[TC'15]: W. Yu, Y. Wang, X. Que, C. Xu. Virtual Shuffling for Efficient Data Movement in MapReduce. IEEE Transactions on Computers. Paper.
[DISCS'15]: H. Fu, Y. Zhu, W. Yu. A Case Study of MapReduce Speculation for Failure Recovery. The 2015 International Workshop on Data-Intensive Scalable Computing Systems (DISCS'15). Paper.
[DISCS'15]: L. Shi, Z. Wang, W. Yu, X. Meng. Performance Evaluation and Tuning of BioPig for Genomic Analysis. The 2015 International Workshop on Data-Intensive Scalable Computing Systems (DISCS'15). Paper.

2014

[BigData'14]: T. Wang*, S. Oral, Y. Wang*, B. Settlemyer, S. Atchley, W. Yu. BurstMem: A High-Performance Burst Buffer System for Scientific Applications. 2014 IEEE Conference on Big Data (Acceptance rate: 18.5%). Washington, DC. October 2014. Paper.
[DISCS'14]: T. Wang*, K. Vasko*, Z. Liu*, H. Chen*, W. Yu. BPAR: A Bundle-Based Parallel Aggregation Framework for Decoupled I/O Execution. The 2014 International Workshop on Data-Intensive Scalable Computing Systems. New Orleans, LA. November 2014. Paper.
[Sigmetrics'14]: J. Tan, Y. Wang, W. Yu, L. Zhang. Non-work-conserving effects in MapReduce: Diffusion Limit and Criticality. ACM SigMetrics 2014 (Acceptance rate: 17%). Austin, TX. June 2014. Paper.
[IPDPS'14]: Y. Wang, R. Goldstone, W. Yu, T. Wang. Characterization and Optimization of Memory-Resident MapReduce on HPC Systems. 28th IEEE International Parallel and Distributed Processing Symposium (Acceptance rate: 21%). Tucson, AZ. May 2014. Paper.
[TPDS'14]: W. Yu, Y. Wang and X. Que. Design and Evaluation of Network-Levitated Merge for Hadoop Acceleration. IEEE Transactions on Parallel and Distributed Systems. Paper.
[CPE'14]: G. F. Lofstead, Q. Liu, J. Logan, Y. Tian, H. Abbasi, N. Podhorszki, J. Y. Choi, S. Klasky, R. Tchoua, R. A. Oldfield, M. Parashar, N. Samatova, K. Schwan, A. Shoshani, M. Wolf, K. Wu, W. Yu. Hello ADIOS: The Challenges and Lessons of Developing Leadership Class I/O Frameworks. Concurrency and Computation: Practice and Experience, John Wiley and Sons. Paper.

2013

[SC'13]: X. Li, Y. Wang, Y. Jiao, C. Xu, W. Yu. CooMR: Cross-Task Coordination for Efficient Data Management in MapReduce Programs. Li and Wang contributed equally to the paper. Denver, CO. (Acceptance Rate: 20%). November 2013. Paper.
[PACT'13]: B. Wang, B. Wu, D. Li, X. Shen, W. Yu, Y. Jiao, J. Vetter. Exploring Hybrid Memory for GPU Energy Efficiency through Software-Hardware Co-Design. The 22nd International Conference on Parallel Architecture and Compilation Techniques (PACT'13). (Acceptance Rate: 17%). September 2013. Edinburgh, Scotland. Paper.
[Cluster'13]: Z. Liu, J. Lofstead, T. Wang, W. Yu. A Case of System-Wide Power Management for Scientific Applications. IEEE International Conference on Cluster Computing. (Acceptance rate: 31%). Indianapolis, IN. September 2013. Paper.
[MASCOTS'13]: B. Wang, Y. Jiao, W. Yu, X. Shen, D. Li, J. Vetter. A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory. Short paper. August 2013. San Francisco, CA. Paper.
[WBDB'13]: Y. Wang, Y. Jiao, C. Xu, X. Li, T. Wang, X. Que, C. Cira, B. Wang, Z. Liu, B. Bailey, W. Yu. Assessing the Performance Impact of High-Speed Interconnects on MapReduce. Invited paper to the Proceedings of Workshop on Big Data Benchmarks. 2013. Paper.
[ICAC'13]: Y. Wang, J. Tan, W. Yu, L. Zhang, X. Meng. Preemptive ReduceTask Scheduling for Fair and Fast Job Completion. 10th International Conference on Autonomic Computing (ICAC'13). (Acceptance Rate: 22%). June 2013. Paper.
[MSST'13]: Y. Tian, Z. Liu, S. Klasky, B. Wang, H. Abbasi, S. Zhou, N. Podhorszki, T. Clune, J. Logan and W. Yu. A Lightweight I/O Scheme to Facilitate Spatial and Temporal Queries of Scientific Data Analytics. IEEE Symposium on Massive Storage Systems and Technologies (MSST'13). (Acceptance Rate: 13%). May 2013. Paper.
[NAS'13]: Yuan Tian, Scott Klasky, Weikuan Yu, Bin Wang, Hasan Abbasi, Norbert Podhorszki, Ray Grout. DynaM: Dynamic Multiresolution Data Representation for Large-Scale Scientific Analysis. NAS 2013. Xi'an, China. July 2013. Paper.
[ICCCN'13]: Zhuo Liu, Bin Wang, Teng Wang, Yuan Tian, Cong Xu, Yandong Wang, Weikuan Yu, Carlos A. Cruz, Shujia Zhou, Tom Clune, Scott Klasky. Profiling and Improving I/O Performance of a Large-Scale Climate Scientific Application. ICCCN 2013. Paper.
[JCC'13]: Yuan Tian, Cong Xu, Weikuan Yu, Jeffrey S. Vetter, Scott Klasky, Honggao Liu, Saad Biaz. neCODEC: nearline data compression for scientific applications. Journal of Cluster Computing. April 2013. Paper.
[CCGrid'13]: C. Xu, R. Graham, M. Venkata, Y. Wang, Z. Liu, and W. Yu. SLOAVx: Scalable LOgarithmic AlltoallV Algorithm for Hierarchical Multicore Systems. International Conference on Cluster Cloud and Grid Computing (CCGrid'13). (Acceptance Rate: 22%). May 2013. Delft, Netherlands. Paper.
[IPDPS'13]: Y. Wang, C. Xu, X. Li and W. Yu. JVM-Bypass for Efficient Hadoop Shuffling. International Parallel and Distributed Processing Symposium (IPDPS'13). (Acceptance Rate: 21%). May 2013. Boston, MA. Paper.
[IPDPS'13 PhD Forum]: Bin Wang, Weikuan Yu. Performance and Power Simulation for Versatile GPGPU Global Memory. 2013 IPDPS PhD Forum. Boston, MA.

2012

[First Place, 2012 ACM SRC Grand Finals]: Y. Tian and W. Yu. Smart-IO: System-Aware Two-Level Data Organization for Efficient Scientific Analytics. 2012 ACM Grand Finals Student Research Competition. ACM Awards Ceremony, San Francisco, CA. June 2012. [ACM SRC Website].
[HPDC'12]: Yuan Tian, Scott Klasky, Weikuan Yu, Hasan Abbasi, Bin Wang, Norbert Podhorszki, Ray W. Grout, Matthew Wolf. A system-aware optimized data organization for efficient scientific analytics. Poster. HPDC 2012. (Acceptance Rate: 23% combined with full papers). 125-126. Paper.
[SC'12]: D. Li, J.S. Vetter, and W. Yu. Classifying Soft Error Vulnerabilities in Extreme-Scale Scientific Applications Using a Binary Instrumentation Tool. SC'12. (Acceptance Rate: 21%). Salt Lake City, 2012. Paper.
[CEE'12]: Y. Tian*, W. Yu, J.S. Vetter. RXIO: Design and Implementation of High Performance RDMA-capable GridFTP. Journal of Computers and Electrical Engineering. Vol 38 (2012) 772-784. Paper.
[IJPP'12]: V. Tipparaju, E. Apra, W. Yu, Xinyu Que, J.S. Vetter. Runtime Techniques to Enable a Highly-Scalable Global Address Space Model for Petascale Computing. International Journal of Parallel Programming. 2012. Paper.
[JPDC'12]: W. Yu, X. Que, V. Tipparaju, and J.S. Vetter. HiCOO: Hierarchical Cooperation for Scalable Communication in Global Address Space Programming Models on Cray XT Systems. Journal of Parallel and Distributed Computing (JPDC). 2012. Paper.
[IPDPS'12]: D. Li, J.S. Vetter, G. Marin, C. McCurdy, C. Cira, Z. Liu, and W. Yu. Identifying Opportunities for Byte-Addressable Non-Volatile Memory in Extreme-Scale Scientific Applications. International Parallel and Distributed Processing Symposium (IPDPS'12). (Acceptance Rate: 21%). May 2012. Shanghai, China. Paper.
[MASCOTS'12]: Y. Tian, S. Klasky, W. Yu, H. Abbasi, B. Wang, N. Podhorszki, R. Grout, M. Wolf. SMART-IO: SysteM-AwaRe Two-Level Data Organization for Efficient Scientific Analytics. MASCOTS 2012. 181-188. Paper.
[MASCOTS'12]: Z. Liu, B. Wang, P. Carpenter, D. Li, J.S. Vetter, W. Yu. PCM-Based Durable Write Cache for Fast Disk I/O. MASCOTS 2012. 451-458. Paper.

2011

[SC'11]: Y. Wang, X. Que, W. Yu, D. Goldenberg, D. Sehgal. Hadoop Acceleration through Network Levitated Merging. SC11. (Acceptance Rate: 21%). Seattle, WA. Paper, Project Website, Code Download.
[ICPP'11]: Weikuan Yu, Vinod Tipparaju, Xinyu Que, Jeffrey S. Vetter. Virtual Topologies for Scalable Resource Management and Contention Attenuation in a Global Address Space Model on the Cray XT5. ICPP 2011: 235-244. Paper.
[Cluster'11]: Yuan Tian, Scott Klasky, Hasan Abbasi, Jay F. Lofstead, Ray W. Grout, Norbert Podhorszki, Qing Liu, Yandong Wang, Weikuan Yu. EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization. CLUSTER 2011: 93-102. Paper.
[Cluster'11]: Weikuan Yu, K. John Wu, Wei-Shinn Ku, Cong Xu, Juan Gao. BMF: Bitmapped Mass Fingerprinting for Fast Protein Identification. CLUSTER 2011: 17-25. Paper.
[CCGrid'11]: Xinyu Que*, Weikuan Yu, Vinod Tipparaju, Jeffrey S. Vetter, Bin Wang. Network-Friendly One-Sided Communication through Multinode Cooperation on Petascale Cray XT5 Systems. CCGRID 2011: 352-361. Paper.
[ERSS'11]: Z. Liu*, J. Zhou, W. Yu, F. Wu, X. Qin and C. Xie. MIND: A Black-Box Energy Consumption Model for Disk Arrays. In 1st International Workshop on Energy Consumption and Reliability of Storage Systems (ERSS'11). Orlando, Florida, July 2011. Paper.

2010

[CF'10]: V. Tipparaju, E. Apra, W. Yu, J.S. Vetter. Enabling a highly-scalable global address space model for petascale computing. International Conference on Computing Frontiers. (Acceptance Rate: 25%). Bertinoro, Italy. May 2010. Paper.
[ISC'10]: Weikuan Yu, Xinyu Que, Vinod Tipparaju, Richard L. Graham, Jeffrey S. Vetter. Cooperative server clustering for a scalable GAS model on petascale Cray XT5 systems. Computer Science - R&D 25(1-2): 57-64 (2010). Paper.

2009

[IPDPS'09]: W. Yu, O. Drokin, J.S. Vetter. Design, Implementation, and Evaluation of Transparent pNFS on Lustre. 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS'09). (Acceptance Rate: 23%). Rome, Italy. Paper.

2008

[SC'08]: N.S.V. Rao, W. Yu, S.W. Poole, W.R. Wing, J.S. Vetter. Wide-Area Performance Profiling of 10GigE and InfiniBand Technologies. SC08. (Acceptance Rate: 21%). Nov 2008. Austin, TX. Paper.
[SC'08]: S. Alam, R. Barrett, M. Bast, M. R. Fahey, J. Kuehn, C. McCurdy, J. Rogers, P. Roth, R. Sankaran, J. S. Vetter, P. Worley, W. Yu. Early Evaluation of IBM BlueGene/P. SC08. (Acceptance Rate: 21%). Nov 2008. Austin, TX. Paper.
[IPDPS'08]: W. Yu, J.S. Vetter, Sarp Oral. Performance Characterization and Optimization of Parallel I/O on the Cray XT. IPDPS 2008. (Acceptance Rate: 26%). April 2008. Miami, FL. Paper.
[ICPP'08]: W. Yu, J.S. Vetter: ParColl: Partitioned Collective I/O on the Cray XT. International Conference on Parallel Processing (ICPP'08). (Acceptance Rate: 31%). Portland, OR. Paper.
[CCGrid'08]: Weikuan Yu, Jeffrey S. Vetter. Xen-Based HPC: A Parallel I/O Perspective. CCGRID 2008. Paper.
[NAS'08]: Weikuan Yu, Nageswara S. V. Rao, Jeffrey S. Vetter. Experimental Analysis of InfiniBand Transport Services on WAN. NAS 2008. Chongqing, China. Paper.
[EuroPar'08]: Weikuan Yu, Sarp Oral, Shane Canon, Jeffrey S. Vetter, Ramanan Sankaran. Empirical Analysis of a Large-Scale Hierarchical Storage System. Euro-Par 2008: 130-140. Canary Islands, Spain. Paper.

2007

[CCGrid'07]: Weikuan Yu, Jeffrey S. Vetter, Shane Canon, Song Jiang: Exploiting Lustre File Joining for Effective Collective IO. CCGRID 2007. Paper.
[ICPP'07]: Feng Chen, Song Jiang, Weisong Shi, Weikuan Yu: FlexFetch: A History-Aware Scheme for I/O Energy Saving in Mobile Computing. ICPP 2007. Paper.

2006 and Before

[ICPP'06]: Shuang Liang, Weikuan Yu, Dhabaleswar K. Panda: High Performance Block I/O for Global File System (GFS) with InfiniBand RDMA. ICPP 2006: 391-398. Paper.
[ICPP'06]: Qi Gao, Weikuan Yu, Wei Huang, Dhabaleswar K. Panda: Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand. ICPP 2006. Paper.
[IPDPS'06]: Weikuan Yu, Qi Gao, Dhabaleswar K. Panda: Adaptive connection management for scalable MPI over InfiniBand. IPDPS 2006. Paper.
[IPDPS'05]: Weikuan Yu, Timothy S. Woodall, Richard L. Graham, Dhabaleswar K. Panda: Design and Implementation of Open MPI over Quadrics/Elan4. IPDPS 2005. Paper.
[ICS'05]: Weikuan Yu, Shuang Liang, Dhabaleswar K. Panda: High performance support of parallel virtual file system (PVFS2) over Quadrics. ICS 2005. Paper.
[Cluster'05]: Pavan Balaji, Wu-chun Feng, Qi Gao, Ranjit Noronha, Weikuan Yu, Dhabaleswar K. Panda. Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines. CLUSTER 2005. Paper.
[IJHPCA'05]: Weikuan Yu, Sayantan Sur, Dhabaleswar K. Panda, Rob T. Aulwes, Richard L. Graham. High Performance Broadcast Support in LA-MPI over Quadrics. IJHPCA 19(4): 453-463 (2005). Paper.
[Cluster'04]: Weikuan Yu, Dhabaleswar K. Panda, Darius Buntinas. Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM. CLUSTER 2004. Paper.
[IEEE Micro'04]: Jiuxing Liu, B. Chandrasekaran, Weikuan Yu, Jiesheng Wu, Darius Buntinas, Sushmitha P. Kini, Dhabaleswar K. Panda, Pete Wyckoff: Microbenchmark Performance Comparison of High-Speed Cluster Interconnects. IEEE Micro 24(1): 42-51 (2004). Paper.
[HiPC'04]: Weikuan Yu, Jiesheng Wu, Dhabaleswar K. Panda. Fast and Scalable Startup of MPI Programs in InfiniBand Clusters. HiPC 2004. Paper.
[ICPP'03]: Weikuan Yu, Darius Buntinas, Dhabaleswar K. Panda. High Performance and Reliable NIC-Based Multicast over Myrinet/GM-2. ICPP 2003. Paper.
[HotI'03]: Jiuxing Liu, Balasubramanian Chandrasekaran, Weikuan Yu, Jiesheng Wu, Darius Buntinas, Sushmitha Kini, Peter Wyckoff, Dhabaleswar K. Panda. Micro-Benchmark Level Performance Comparison of High-Speed Cluster Interconnects. Hot Interconnects 2003. Stanford, CA. Paper.
[SC'03]: Jiuxing Liu, B. Chandrasekaran, Jiesheng Wu, Weihang Jiang, Sushmitha P. Kini, Weikuan Yu, Darius Buntinas, Pete Wyckoff, Dhabaleswar K. Panda: Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics. SC 2003. Paper.

[Complete List] from the PASL Publication Database.