Best Practices, Performance Tuning and Benchmarking
version 1.0.3 [2009-07-04]
Architecture methodologies
- Current state » Future state » gap analysis » roadmap » business case

Capacity plan
- Pitfall: using favorable vendor estimates, while practical values are (much) lower.

- Acceptance criteria are primarily used to minimize the project risks

Total Cost of Ownership (TCO)
- When determining the TCO for an existing customer, factors such as physical floor space, staff management and software license costs are important.
Storage benchmarking
  • Linux kernel: originally created by Linus Torvalds in 1991
  • Version format: A.B.C, where A is kernel version, B is major version, C is minor version
Hardware disposal
  • Clear configuration
  • Remove sensitive data (password, network configuration, custom files, VLAN database information)
Differences between NFS version 3 and 4
Port
  • NFS v3: 2049 UDP or TCP
  • NFS v4: 2049 TCP only
Services needed
  • NFS v3: portmapper, rpc.lockd, rpc.mountd, rpc.nfsd, rpc.quotad
  • NFS v4: rpc.idmapd, rpc.mountd, rpc.nfsd
Version advantages
  • NFS v3: common, often used
  • NFS v4: firewall friendly; improved cache management; non-Unix compatibility (Windows); good crash recovery; advanced/extended security

Network autonegotiation
When autonegotiation for network hardware was first introduced, it had compatibility issues between vendors. This has led to the belief that network administrators should set port speed and duplex mode to fixed values. However, best practice is to use autonegotiation for all devices, unless problems are visible (low performance). For everything with a speed of 1 gigabit or higher, autonegotiation is the best default choice. For 100 Mbit connections it depends on the vendor and the network components used.
Benefits of autonegotiation:
  • Ability to detect bad cables and link failures
  • Detection of the partner's capabilities (like PHY type)
  • Flow control functionality (like Pause Frames)
Performance Tuning
Fibre Channel
- When using long distance FC (usually at 1 Gbps), buffer-to-buffer credits need to be adjusted if the distance is longer than 10 km.
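
Why distance matters for buffer-to-buffer credits can be estimated with some arithmetic. The sketch below is a rule-of-thumb calculation, not a vendor formula. It assumes full-size frames of 2148 bytes, 8b/10b encoding (10 line bits per byte), a line rate of 1.0625 Gbaud for 1 Gbps FC and roughly 5 microseconds of propagation delay per km of fibre. The function name and parameters are hypothetical.

```python
import math

FRAME_BYTES = 2148           # assumed full-size FC frame, including headers
LINE_BITS_PER_BYTE = 10      # 8b/10b encoding
FIBRE_DELAY_US_PER_KM = 5.0  # assumed propagation delay in fibre


def bb_credits_needed(distance_km, line_rate_gbaud=1.0625):
    """Estimate the credits needed to keep a link of this length busy."""
    # Time to put one full frame on the wire, in microseconds.
    frame_time_us = FRAME_BYTES * LINE_BITS_PER_BYTE / (line_rate_gbaud * 1000)
    # A credit only comes back after the frame has arrived and the
    # R_RDY has travelled back, so the full round trip counts.
    round_trip_us = 2 * distance_km * FIBRE_DELAY_US_PER_KM
    return math.ceil(round_trip_us / frame_time_us)


if __name__ == "__main__":
    for km in (10, 50, 100):
        print(km, "km:", bb_credits_needed(km), "credits")
```

Under these assumptions the estimate works out to roughly one credit per 2 km at 1 Gbps. Smaller frames occupy less fibre each, so real workloads may need more credits than this estimate.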

- NetApp storage uses WAFL (Write Anywhere File Layout) to store data; it manages the disks and arranges the data and the file system. WAFL has a fixed block size of 4 kB.

- Creating/designing a SAN for an enterprise-level customer usually takes from a few weeks to several months and needs careful planning.

SAN architecture
- Describes high-level processes, SAN implementation guidelines and product-independent information.

SAN backup strategies
- SAN backup can make backup management easier and provides better utilization of backup resources. Performance will usually be better as well, compared with LAN-based backups.
- A dedicated HBA can be a good option when using multipathing, to increase performance and make management easier.

SAN extending
Goal: interconnect data centers
- The biggest challenge is maintaining availability and speed (which is affected by network latency).
- Brocade switches need a specific license for increasing buffer-to-buffer credits. McData switches do not need it.

- When connecting two fabrics with different PID settings, a router should be used.

SAN failover
- Recommended methods to fail over a single host to a different HBA (when using multipathing) are using the MP software to make the switch, or disabling the specific port on the switch.

- Inter-Switch Links (ISLs) can be used to attach switches to each other. Keep in mind that every ISL costs 2 ports (one on each switch).
- Oversubscription is commonly used on ISLs, but should be monitored. Overloading ISLs will increase latency and can even cause SCSI timeout errors.
- For applications with a low IO profile, a 15:1 oversubscription can be used. For normal IO profiles, use 7:1.
- Replacing existing ISLs should be planned carefully. It triggers RCF (Re-Configure Fabric) or BF (Build Fabric) frames. With RCF frames, user data connections will temporarily stop.
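
The oversubscription ratios above can be turned into an ISL count. This is a minimal sketch that treats the ratio as total host port bandwidth divided by total ISL bandwidth. The function names and the default 2 Gbps port speeds are assumptions for illustration.

```python
import math


def isls_needed(host_ports, ratio, host_gbps=2.0, isl_gbps=2.0):
    """Return the number of ISLs that keeps host:ISL bandwidth <= ratio."""
    host_bandwidth = host_ports * host_gbps
    return math.ceil(host_bandwidth / (ratio * isl_gbps))


def oversubscription(host_ports, isls, host_gbps=2.0, isl_gbps=2.0):
    """Return the oversubscription ratio as a float (7.0 means 7:1)."""
    return (host_ports * host_gbps) / (isls * isl_gbps)


if __name__ == "__main__":
    # 30 host ports on an edge switch, hypothetical example.
    print("low IO profile (15:1):", isls_needed(30, 15), "ISLs")
    print("normal IO profile (7:1):", isls_needed(30, 7), "ISLs")
```

Since every ISL costs a port on both switches, the result also shows how many ports per switch the design has to reserve for ISLs.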

SAN limitations
- Cisco SANs should not exceed 3 hops (which is a verified/certified value).

SAN migrations
Points of interest:
- data locality
- application IO profiles

Ring to core/edge: can be done without disruption. New paths are automatically used, depending on FSPF (Fabric Shortest Path First).
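
FSPF is a link-state protocol that prefers the path with the lowest total link cost. How a newly added path gets picked up can be illustrated with a plain Dijkstra shortest-path search. This is a sketch, not the FSPF implementation: the switch names, the six-switch ring and the equal link cost of 1 are hypothetical.

```python
import heapq


def lowest_cost(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {src: 0}
    queue = [(0, src)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == dst:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return None  # no path between src and dst


# Hypothetical ring: sw1 - sw2 - ... - sw6 - sw1.
ring = {(f"sw{i}", f"sw{i % 6 + 1}"): 1 for i in range(1, 7)}
# Migration step: a core switch is cabled to every ring member.
with_core = dict(ring)
with_core.update({("core", f"sw{i}"): 1 for i in range(1, 7)})

if __name__ == "__main__":
    print("ring only:", lowest_cost(ring, "sw1", "sw4"))
    print("with core:", lowest_cost(with_core, "sw1", "sw4"))
```

In this example the path from sw1 to sw4 becomes shorter as soon as the core links exist, while the ring links stay in place during the migration.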

SAN monitoring
- B-series can monitor zone changes, topology reconfiguration and SFP insertions/removals, when using Fabric Manager.
- Monitor usage of ISLs and extended SAN parts (via FCIP, iFCP etc).

SAN policies
Data management: creating the guidelines, procedures, processes and plans for classifying, storing, moving and archiving of data.

SAN replication
- Use synchronous replication when data has to be identical on more than one site. Since the remote site has to acknowledge each write, application latency can be introduced; this needs to be properly tested.
- Use asynchronous replication when application performance is more important than a zero RPO.
- When the inter-site link goes down while data is being replicated, use a history log/transaction log on the local site.
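
The acknowledgement delay of synchronous replication can be estimated before testing. The sketch below only counts propagation delay and assumes roughly 5 microseconds per km of fibre. The function name and the round_trips parameter are hypothetical; some protocols need more than one round trip per write, and equipment in the path adds its own delay.

```python
FIBRE_DELAY_MS_PER_KM = 0.005  # assumed: about 5 microseconds per km


def added_write_latency_ms(distance_km, round_trips=1):
    """Estimate the extra latency per write caused by the remote acknowledgement."""
    one_round_trip_ms = 2 * distance_km * FIBRE_DELAY_MS_PER_KM
    return round_trips * one_round_trip_ms


if __name__ == "__main__":
    for km in (10, 100, 500):
        print(km, "km:", added_write_latency_ms(km), "ms per write")
```

Asynchronous replication does not wait for the remote acknowledgement, so this delay does not reach the application; the price is an RPO greater than zero.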

SAN security
- Routine periodic security scans
- Physical security, protecting the SAN storage arrays, tape devices, switches and cabling.
- Switch binding can be used to tie a device to a switch port. When activating the option, usually only new devices will have to be registered to become operational.

SAN switches
- Changing Domain ID and Core PID is usually disruptive.
- Configuration can be backed up from the management tool, or by using a remote access protocol (telnet/SSH) in combination with, for example, FTP and RSHD. Configuration backup methods differ per vendor.
- Switch configuration restores usually have to happen when the switch is in 'offline' mode. Switch management information (like the IP address) needs to be set by hand.
- Core PID 0 (Native Mode) --> 4 bits, maximum 16 ports
- Core PID 1 (Core Mode) --> 8 bits, maximum 256 ports
- Core PID 2 (Extended Edge): same as native, but with a maximum of 128 ports. This mode is used when connecting to a core PID 1 switch while systems cannot be rebooted at that time.
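
The port maxima listed for Core PID modes 0 and 1 follow directly from the number of bits available for the port value. A minimal sketch of that arithmetic, with hypothetical names:

```python
# Mode number -> bits available for the port value, as listed above.
CORE_PID_PORT_BITS = {0: 4, 1: 8}


def max_ports(port_bits):
    """Number of distinct port values that fit in the given number of bits."""
    return 2 ** port_bits


if __name__ == "__main__":
    for mode, bits in CORE_PID_PORT_BITS.items():
        print("Core PID", mode, "->", bits, "bits,", max_ports(bits), "ports")
```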

- Brocade switches have a limitation in the zone database size. Usually the oldest switch (with the smallest database size) determines how many nodes can be present in the fabric.

SAN topologies
- Cascaded and ring topologies use one-to-one relations.
- Core/edge gives the ability to tier devices and place devices close to each other (i.e. same switch) for optimal performance.
- Core/edge topology is usually a safe option to use, when creating a design where not all information is available.
- Backbone topology is suitable for many-to-many relations and can be used for SANs in which traffic patterns are not known.

Optimal number of switches:
Cascade: 2-3
Cascade ring: 3-5
Full mesh: 4-8
Partial mesh: 4-8
Core/edge: more than 5
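
The ranges above can be used as a first filter when choosing a topology. A small sketch of such a lookup follows; the names are hypothetical, and "more than 5" is encoded as a minimum of 6 without an upper bound.

```python
# (minimum, maximum) switch counts from the list above; None means no upper bound.
OPTIMAL_SWITCHES = {
    "cascade": (2, 3),
    "cascade ring": (3, 5),
    "full mesh": (4, 8),
    "partial mesh": (4, 8),
    "core/edge": (6, None),  # "more than 5"
}


def candidate_topologies(switch_count):
    """Return the topologies whose optimal range includes switch_count."""
    matches = []
    for name, (low, high) in OPTIMAL_SWITCHES.items():
        if switch_count >= low and (high is None or switch_count <= high):
            matches.append(name)
    return matches


if __name__ == "__main__":
    for count in (3, 6, 12):
        print(count, "switches:", candidate_topologies(count))
```

The switch count is only one input; traffic patterns and data locality still decide between the remaining candidates.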

SCSI devices
- High Voltage Differential (HVD) and Low Voltage Differential (LVD) devices must not be mixed on the same bus; doing so will most likely damage them.
- Within a SCSI bus, both sides need to be terminated with a dedicated terminator or auto terminator.

SCSI cabling
Maximum cable length of LVD SCSI bus: 12 meters

Security benefits
  • Helps minimize the loss of data, sensitive information, people and knowledge
  • Improves future designs, software, infrastructure and common knowledge
  • Gives SAs, end users, customers and stakeholders more trust in information
Security can be compared to insurance: it costs money, but can minimize real damage when problems occur.

Copyright Michael Boelen -
This work is licensed under a Creative Commons License (Attribution-Noncommercial-No Derivative Works 3.0 Unported).