Revision as of 12:03, 27 July 2011
A FAQ for things you might like to know
What type of networking is used on campus?
The campus network is an Ethernet packet-based network.
What is Ethernet?
Ethernet is a family of packet-based computer networking technologies for local area and wide area networks (LANs and WANs). Most laptops, desktop computers, server computers, cable modems and DSL modems have built-in support for Ethernet networks. For more information and history, read the Wikipedia entry on Ethernet.
(Credits Wikipedia:Ethernet April 08, 2011)
What is the recommended configuration for a researcher's network connection?
It depends on the work that you do. If your work frequently involves moving data sets to and from your computer for visualization, analysis, or collaboration, you should seriously consider a 100Mbs full-duplex network connection as your baseline.
What is the difference between Mbs and MBs?
"Mbs" stands for "megabits per second". "MBs" stands for "megabytes per second". A lower-case "b" designates bits (1's and 0's) and an upper-case "B" designates bytes. 1 byte equals 8 bits.
Bits are used to measure network data transfer rates (bits per second) and bytes are used to measure data storage sizes. When stored data is moved across a network, however, it is convenient to consider transfer rates measured in the number of bytes of stored data moved in one second.
What do 10Mbs, 100Mbs, and 1Gbs mean?
Network speeds are listed by the number of bits (1's and 0's) they can transfer in one second. Modern networks transfer millions of bits per second, designated "Mbs" and read "mega-bits per second". Common network speeds are 10Mbs, 100Mbs, and 1000Mbs. 1000 megabits are equal to 1 gigabit, and 1000Mbs is typically written "1Gbs" and read "one gigabit per second" (1 billion bits per second).
How fast are 10Mbs, 100Mbs, and 1Gbs networks?
To get a sense for the performance of different network speeds, it's easiest to use the following rules of thumb for comparing network speeds to data set sizes and their transfer time:
- 10Mbs can transfer 1MBs
- 100Mbs can transfer 10MBs
- 1000Mbs (1Gbs) can transfer 100MBs
A CDROM can hold 700MB of data. Transferring this much data would take about 7 seconds on a 1Gbs network, 70 seconds (more than 1 minute) to transfer on a 100Mbs network, and 700 seconds (more than 10 minutes) to transfer on a 10Mbs network.
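The CDROM figures above can be reproduced with a short calculation. This is only an illustrative sketch: the names `RULE_OF_THUMB_MBS` and `transfer_seconds` are made up for this example, and the rates are the rule-of-thumb values from the list above, not measured speeds.

```python
# Rule-of-thumb usable rates from the list above, in megabytes per second (MBs).
# The dictionary and function names are illustrative, not part of any UAB tool.
RULE_OF_THUMB_MBS = {"10Mbs": 1, "100Mbs": 10, "1Gbs": 100}


def transfer_seconds(size_mb, link):
    """Estimate the seconds needed to move size_mb megabytes over the named link."""
    return size_mb / RULE_OF_THUMB_MBS[link]


if __name__ == "__main__":
    cd_size_mb = 700  # a CDROM's worth of data
    for link in ("1Gbs", "100Mbs", "10Mbs"):
        print(f"{link}: about {transfer_seconds(cd_size_mb, link):.0f} seconds")
```

Real transfers will usually take longer than these estimates because software, hardware, and other network users also affect the rate.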
What's the justification for this transfer rate rule of thumb?
The logic for this metric is that a 10Mbs (10 mega-bit per second) network connection will move 10 million bits per second. Data is measured in 8-bit bytes and the rule of thumb for Ethernet is that performance peaks at 80% capacity. This provides the easy conversion factor of 10Mbs=1MBs. Note that the lower-case "b" means "bits" and upper-case "B" means bytes, i.e. 8 bits. The network speeds scale up easily by factors of 10. So a 100 megabit per second connection is capable of transferring 10 megabytes per second, and a 1000 megabit per second connection is capable of transferring 100 megabytes per second.
Theoretically, a 100Mbs connection will transfer 100 million bits in one second, or about 10 megabytes (MB) per second. This means you would be able to transfer a CD's worth of data (about 700MB) in about 70 seconds, about 1 minute. (Compare this to a 10x slower connection of 10Mbs and it would take 700 seconds, more than 10 minutes.)
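The conversion factor can be written out as a small formula: usable MBs = link speed in Mbs x 0.8 / 8. The sketch below assumes the 80% Ethernet efficiency quoted above; the function name `usable_mbytes_per_second` is hypothetical and exists only for this illustration.

```python
def usable_mbytes_per_second(link_mbits_per_second, efficiency=0.8):
    """Convert a link speed in Mbs to an estimated usable rate in MBs.

    efficiency=0.8 is the Ethernet rule of thumb quoted above;
    the divisor 8 is the number of bits in one byte.
    """
    return link_mbits_per_second * efficiency / 8


if __name__ == "__main__":
    for mbits in (10, 100, 1000):
        print(f"{mbits}Mbs -> about {usable_mbytes_per_second(mbits):.0f}MBs")
```

Passing `efficiency=1.0` gives the raw bit-to-byte conversion instead, for example 12.5MBs for a 100Mbs link.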
How much network bandwidth is available on campus?
Individual network connections at 10Mbs, 100Mbs, or 1Gbs speeds can be delivered to any location on the campus network at standard rates. Additionally, wireless network connectivity is available across campus.
What does the campus network look like?
The campus network can be visualized as a collection of network trees, roughly one per building, with the root of each tree connecting to an expandable high bandwidth core network backplane (currently running at 10Gbs).
The depth of each individual tree is determined by the physical layout of and number of network ports in each building. Each tree is typically no more than three layers deep, including the leaf nodes. The leaf nodes are the end-user connections, i.e. wired wall ports or wifi connections. The internal nodes of each tree are network switches and the switches are connected to the next layer via fast connections (currently running at 1Gbs).
Each tree (each building) connects to the core network backplane via a fast connection (currently running at 1Gbs). At this core network connection, the data packets are routed to their final destination on- or off-campus.
How is the campus network connected to off-campus networks?
The campus core network backplane is connected to off-campus networks like the commercial Internet (Google, Facebook, Amazon) and national high bandwidth research networks (Internet2 and NLR) which provide high speed connections to research institutions and labs across the country. The fastest network route to a specific off-campus destination is chosen automatically as the network packets move off-campus.
Custom configurations to meet unique research needs or specific performance targets can be designed. This requires advanced planning and an understanding of the proposed research workloads and workflow. Please contact Research Computing. The cost for these customizations can often be included in research proposals.
How do I order or upgrade a network connection?
To place an order you will need to provide a general ledger account number for billing and identify the location (building address) of the service request. The wall-jack identification number for the network connection will be needed to complete the service request and can be entered on the form.
If you have questions please contact UABCOMM@uab.edu or call 4-0503.
Who pays for my network connection?
Network connections are accounted for via a federally regulated service center run by UAB IT. The rates are set based on the cost to deliver the service. Money to pay for network connectivity can come from any legitimate source: directly through grants, indirect grant funds routed to departments, or other departmental or research support funds.
How much do network connections cost?
Standard service center rates apply to all network connections (10Mbs, 100Mbs, and 1Gbs). Discounted rates for upgrading existing connections to higher data rates are available. Additionally, network switches can be ordered at a fixed lease rate to supply many network connections to an area.
Please contact UAB IT Telecommunications for rates at UABCOMM@uab.edu or call 4-0503.
How do I measure my campus network connection speed?
The UAB IT SpeedTest server speedtest.dpo.uab.edu will run a data transfer test from your computer to the SpeedTest server and rate the performance of a data connection.
How do I measure my network bandwidth from my computer to Cheaha?
You can test the data transfer performance between your computer and Cheaha by using iperf. To run an iperf test you will need to install iperf on your desktop. Iperf is readily available on Windows, Mac, and Linux. It is already installed on Cheaha.
To run a 30-second iperf test that moves data from Cheaha to your computer:
- Start iperf from a command shell on your desktop in "server" mode
iperf -s -i 1
- Log into your Cheaha account
- Start iperf from the command shell on Cheaha in "client" mode
/opt/iperf/bin/iperf -c <ip-of-your-computer> -t 30 -i 1
The iperf program output on Cheaha will update a data transfer rate to your computer every second for 30 seconds. If this data rate is not what you expect or is confusing, please send an email with the output from both command windows to [email:email@example.com]
Note: This test requires that your computer have a public IP address that can be accessed by Cheaha
Note: The iperf server listens on port 5001 by default. Your local desktop (the iperf server here) should allow incoming connections on this port from Cheaha. The following iptables command will append a rule to the INPUT chain that opens port 5001 for incoming tcp connections from Cheaha. Consult your local system/network administrator before adding this rule.
# sudo /sbin/iptables -A INPUT -p tcp -s 18.104.22.168 --dport 5001 -j ACCEPT
Note: Iperf will currently only test your speed for data transferred from Cheaha to your client. This should provide a reasonable estimate for data transferred to Cheaha as well.
What factors impact the actual speeds I can expect in the real world?
The actual transfer rates you get depend on three factors: software, hardware, and other users.
Data transfer software and computer hardware can significantly impact real world transfer rates. If you are transferring lots of data, you will see your best performance with software that can keep the network full, computer hardware that is not slower than the data network, and a network connection sized for your data sets and patience.
How does my copying software impact my transfer speeds?
The software you use to transfer data is the most important factor in maximizing data throughput. Most traditional copy methods move data in a single-file line. Modern computer hardware hides this software inefficiency and can easily keep a 10Mbs connection full and can do OK with a 100Mbs connection. If you are moving lots of data or using a 1Gbs network, you need to use software tuned for high-speed data transfer.
High speed data transfer software uses multiple single-file lines in parallel to improve network throughput. This software must be used at both ends of the data transfer in order to coordinate the parallel transfer streams. You won't get very far if you are smart but your peer is not.
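The idea of several parallel streams can be sketched with a local file copy. This is not how any particular transfer tool is implemented; it is a minimal illustration that assumes each stream handles one byte range of the file, and the function names `copy_range` and `parallel_copy` are hypothetical. A real tool would send each range over its own network connection and coordinate with software at the other end.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length):
    """Copy one byte range; each call stands in for one transfer stream."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        # Reads the whole range into memory, which is fine for a sketch.
        fout.write(fin.read(length))


def parallel_copy(src, dst, streams=4):
    """Copy src to dst using several byte-range streams at once."""
    size = os.path.getsize(src)
    # Pre-size the destination so every stream can write at its own offset.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    chunk = -(-size // streams)  # ceiling division: bytes per stream
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [
            pool.submit(copy_range, src, dst, offset, chunk)
            for offset in range(0, size, chunk or 1)
        ]
        for future in futures:
            future.result()  # re-raise any error from a stream
```

Calling `parallel_copy("data.bin", "copy.bin", streams=4)` would split the file into four ranges and copy them concurrently; the file names here are placeholders.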
What high-speed data transfer software can keep up?
It's important to use data transfer software that can move data in multiple parallel streams. (Post examples)
How does my computer hardware impact my transfer speeds?
Computer hardware also impacts transfer speeds. Your slowest piece of hardware will dictate your maximum data transfer rate. If you have a slow disk (you should read that as "an external USB hard drive"), you will be limited by its data transfer speeds.
Additionally, your computer may be fast but it still has to manage your workload and coordinate use of all the devices in your computer, including the network connection. If you are crunching numbers or doing heavy visualizations at the same time you are trying to transfer data, your computer may not be able to keep up. Note that this scenario is common when you are reading data for your visualization off a file server. Sometimes you need to move your data before you can use it.
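The "slowest piece of hardware" rule amounts to taking the minimum rate along the transfer path. The sketch below illustrates that; the function name `bottleneck` is hypothetical and the component rates are made-up example values, not measurements of any real device.

```python
def bottleneck(rates_mbs):
    """Return the (name, rate) of the slowest component in the transfer path."""
    name = min(rates_mbs, key=rates_mbs.get)
    return name, rates_mbs[name]


if __name__ == "__main__":
    # Hypothetical example rates in MBs; measure your own hardware for real values.
    path = {
        "external USB drive": 30,
        "1Gbs network (rule of thumb)": 100,
        "file server disk": 150,
    }
    name, rate = bottleneck(path)
    print(f"Limited by {name}: about {rate}MBs")
```

With these example numbers, upgrading the network would not help until the slower drive is replaced.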
How do I measure my off-campus network connection speed?
The SpeedTest.net service can be used to measure your connection to key points on the Internet. To run this test, choose the Atlanta, GA connection point. This will run a data transfer test from your computer, off-campus to the SpeedTest.net server hosted by Comcast in Atlanta, GA. The test will rate the performance of a data transfer.
Atlanta is a good test destination because this is where UAB's Internet-bound traffic actually connects to the commodity Internet. This test will show the network performance to our nearest off-campus neighbor. If you want to share the results of this test with others, please be sure to click the "Share this Test" and then "Copy" buttons. This will provide you a URL to a PNG image capturing the results of this test that anyone can load in their browser.
What factors impact my off-campus network connection speed?
It is important to understand that Internet traffic speeds are highly variable. Transfer speed depends heavily on the network capacity and use along the entire path from your desktop to the location with which you are exchanging data. It also depends on the capabilities of your desktop and the server that is the target of your data transfer. If the networks or remote sites are overloaded or have insufficient bandwidth, then your data transfer speeds will be limited by those conditions.
As an example, you can try a speed test to a network destination other than Atlanta, GA or a speed test hosted by a network provider other than Comcast. The speed tests from Ookla.net and Speakeasy.net may show different performance for the selected destinations. You may also find the information at SpeedTest.org informative.
Is there storage space for research data?
The rapidly growing demand for research storage is clearly recognized. Solutions for hosting research data are under active development (and funding discussions) as part of the UABgrid Pilot. Currently, research storage is only available through the traditional compute cluster interface of Cheaha.
How can I contribute to the development of research storage?
The best way to contribute to the development of research storage is to share your storage requirements. It will be helpful if you can share information on the following topics with an email to [email:firstname.lastname@example.org]:
- How much data do you currently store?
- How are you solving your research data problem today?
- How much do you expect your data to grow in the next year?
- Are you building an analysis pipeline that has known storage expectations?
- Do you need to archive your data? How long?
- Do you need to keep all your data on-line?
- Do you ever delete your data?
- How expensive is it for you to recreate derived data products?
How can I use the existing research storage on Cheaha?
The generally available research storage on the cluster is designated to support storage requirements for the construction of data analysis pipelines where data needs to be shared by multiple users on the cluster. To request such storage please send an email to [email:email@example.com].
The pilot research storage on the cluster is being developed to support the much broader use case of data sharing in collaborations. If you are interested in participating in this pilot, please send a use case and justification of your project to [email:firstname.lastname@example.org].
What best-practices exist for storing my research data?
There are many solutions for storing your research data. Simply keeping it on your desktop is one option. As data grows it is often necessary to move it off your system. Most people find some form of USB Drive to be an acceptable solution. One solution that has become popular is the use of DroboFS.
Note: No endorsements are made of any product or of the fitness of any solution.
Cheaha Cluster Access
How do I get an account to use cluster computing on Cheaha?
Please send an email to the support group requesting an account. Include your UAB BlazerID and some information about which group you are a part of here on campus and what your plans are for using the cluster.
How do I get started using the cluster after I have an account?
A basic getting started guide is available and should answer questions about how to log in to Cheaha and submit a batch job.
How do I cut-and-paste into a terminal window when ctrl+c always exits my commands?
Using a terminal window for an SSH session from your desktop, you can cut-and-paste into that terminal window from your desktop, e.g. you may want to copy the example job commands in the getting started guide. The exact key combination varies depending on the terminal program you use but it is often Shift+Ctrl+C. On Macs, the normal command+c keystroke often works since it does not generate the ctrl+c character sequence.
Why is the wiki markup syntax different between my project space and the docs wiki?
The "Projects" wikis are implemented using a tool called Trac and follow a formatting convention popularized by earlier wikis, mainly MoinMoin. The "Docs" wiki is implemented using a tool called MediaWiki and follows a formatting convention popularized by Wikipedia. Because these communities have focused on addressing specific use cases, software developers in the case of Trac and document writers in the case of MediaWiki, their formatting conventions differ significantly in their details.
Section heading markup (using '=' to designate section headings) and external urls (typing in a bare URL like http://google.com) are typically portable between the two wikis, but details like table layout vary widely.
An easy option is to leave pages in place and reference them by name from the Projects or Docs wikis.
UABgrid is an infrastructure pilot of UAB IT Research Computing. More information can be found in the UABgrid FAQ though this information may be dated.