chara:trouble_shooting

Differences

This shows you the differences between two versions of the page.

chara:trouble_shooting [2021/11/09 15:30]
gail_stargazer
chara:trouble_shooting [2022/01/21 20:03]
charaobs
Line 27: Line 27:
   * Check the instrument alignment. Is flux getting through to the detector? How long has it been since the last NIRO camera alignment? Classic and CLIMB programs can run for about an hour before the light will drift from the central pixel. Use the Classic or CLIMB gui to view the light on the pixels by clicking the PICTURE tab and then the PIXEL AREA button. Turn the camera off with the STOP button. Is the right dither power turned on? CLIMB 1 and Classic use different dithers. If Classic or CLIMB fringes are found in a scan, but not when in recording mode, the dither powers are likely not on. Are the camera settings correct for the seeing conditions and flux levels?
  
-===== Restarting Servers =====
+===== The new OPLE system =====

-==== Restarting Servers using the bootlaunch paradigm ====
+With the implementation of the new ople system, which replaced the VME in Fall of 2021, new troubleshooting issues have arisen. Since there are now 6 new ople computers to run the carts for each scope individually, many issues will be specific to one computer or telescope and not to the system as a whole. Restarting the VME is now a thing of the past.
  
-If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list".
+The traditional ople server will still be used to communicate with each new ople computer, identified as OPLE 1 to OPLE 6. When the communications are good, each active cart will be displayed in the ople server or ople gui status tab as before. At times, an ople computer can lose communications or a server can crash, and the server or comms need to be restarted.
  
-To restart a server, log on to the machine that runs the server and type "bootlaunch_master". This script will go through the list of executables and will check which servers are running. If a server isn't running, bootlaunch_master will remove it from socket manager, clear the lock file, and relaunch the server. The instructions below describe how to restart individual servers, but this should not be necessary anymore. \\  \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running – they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world writeable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing of note about the output of the bootlaunch scripts: they call a number of other programs which themselves have output that may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
+If a cart cannot be started, stopped or otherwise commanded, look to the OPLESystem gui to see if a green indicator has turned red. A message will also often pop up on the ople gui saying the command could not be sent. This may require a simple start command to restart the server, or a reboot of the computer to get it back to yellow and then a start to get it back to green. Do either of these steps with a right click on the red button and then select start or reboot. After a reboot, select start to load the servers once the indicator has turned yellow. When this is done, the ople server will need to be connected to the newly restarted server. Type "oo" into the ople server to open ople comms with the new ople server. It should now reappear on the ople server display and say System Ready to indicate comms are restored.
  
-  * bootlaunch_beamsamp – Starts the beam sampler servers, BS1 and BS2.
-  * bootlaunch_zaber – Starts the ZABER_2 server.
+Sometimes a cart is stopped and cannot be commanded. If the cart has gone to the front hard or back hard switch, it will not be usable until it is moved from the switch and the Ople Controller box is reset. There are 6 silver boxes for these controllers, each with two green LEDs for the front and back switches and two red LEDs for the back hard and front hard switches. If a red LED is lit, there will be an error displayed on the message window and the cart is disabled. The cart will need to be moved off the switch and then the box can be reset with the RESET button on the front. The error display will go away and the red LED will be off. The cart is now controllable.
  
- \\ telescope bunker computers:
+===== Restarting Servers =====
  
-  * bootlaunch_hut – Starts the E1_HUT, E2_HUT, S1_HUT, S2_HUT, W1_HUT, or W2_HUT server, depending on the machine it's launched from.
-  * bootlaunch_rpc – Starts the RPC_E1, RPC_E2, RPC_S1, RPC_S2, RPC_W1, or RPC_W2 server, depending on the machine it's launched from.
-  * bootlaunch_weather – Starts the E1_WEATHER, E2_WEATHER, S1_WEATHER, S2_WEATHER, W1_WEATHER, or W2_WEATHER server, depending on the machine it's launched from.
-  * bootlaunch_lower – Starts the E1_Lower, E2_Lower, S1_Lower, S2_Lower, W1_Lower, or W2_Lower cylinder server, depending on the machine it's launched from.
-  * bootlaunch_upper – Starts the E1_Upper, E2_Upper, S1_Upper, S2_Upper, W1_Upper, or W2_Upper server, depending on the machine it's launched from.
+==== Restarting Servers using the bootlaunch paradigm ====
+
+If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list".
  
- \\ Note: The bootlaunch scripts will not start a new server if there is an existing process running. Therefore, type "ps aux | grep //server_name//" where server_name is the name of the server. If there is a dead process running, look up the process identification number (PID) and type "kill -9 PID" to kill the process and then run the relevant bootlaunch script.
+To restart a server, log on to the machine that runs the server and type "bootlaunch_master". This script will go through the list of executables and will check which servers are running. If a server isn't running, bootlaunch_master will remove it from socket manager, clear the lock file, and relaunch the server. The instructions below describe how to restart individual servers, but this should not be necessary anymore. \\  \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running – they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world writeable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing of note about the output of the bootlaunch scripts: they call a number of other programs which themselves have output that may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Note: The bootlaunch scripts will not start a new server if there is an existing process running. Therefore, type "ps aux | grep //server_name//" where server_name is the name of the server. If there is a dead process running, look up the process identification number (PID) and type "kill -9 PID" to kill the process and then run the relevant bootlaunch script.
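The "ps aux | grep" check in the note above lists the grep process itself next to any real server process, which is easy to misread. A minimal sketch of a helper that filters the listing and prints only the PIDs of matching processes follows. The function name pids_for is hypothetical and is not part of the CHARA tools, and the sketch assumes the usual "ps aux" column layout in which the command starts in column 11.

```shell
#!/bin/sh
# pids_for NAME: read "ps aux" output on stdin and print the PID (column 2)
# of every line that contains NAME, skipping the line for grep itself.
# Assumption: standard "ps aux" layout, where the command begins in column 11.
pids_for() {
    awk -v name="$1" 'index($0, name) > 0 && $11 != "grep" { print $2 }'
}
```

Typing "ps aux | pids_for pico_server" would then print only PIDs of live or stuck pico_server processes, and nothing at all if none exist. Any PID it prints is the one to pass to "kill -9" before running the relevant bootlaunch script.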
  
 ==== Restarting Servers using the rc.local file ====
  
- \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
+ \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm (there may no longer be any at this point). \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
  \\  \\
 Log on to the relevant computer by typing the computer name (ctrscrut, ople, s1, …). If the shortcut doesn't work then type "ssh //name//" where name is the computer name. \\  \\ Find out if the server is running by typing "ps aux | grep //server_name//" where server_name is the name of the server. \\ [ctrscrut:599] ps aux | grep pico_1 \\ observe 9281 0.0 0.0 61156 692 pts/3 S+ 13:58 0:00 grep pico_1 \\ observe 12578 0.0 0.0 24524 11212 ? S Jun16 33:14 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg \\  \\ If the entry for the dead server shows up in the process list, then identify the process identification number (12578 for the example above) and kill the server by typing "kill -9 //PID//" where PID is the process identification number. \\  \\ Look up the commands to restart the server by typing "more /etc/rc.local" (this is relevant for servers that run in the background). Press the space bar to scroll through the contents of the rc.local file. Locate the commands relevant for the server that needs to be restarted and copy and paste them into a terminal window: \\  \\ #Start PICO server for PICO #1 \\ /bin/rm -f /var/lock/LCK..ttyC8 \\ /usr/local/bin/tsockman remove PICO_1 \\ /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg & \\  \\ The first command removes the lock to allow the server to restart. The second command removes the name from the socket manager listing. The last command restarts the server. Note that if you are restarting the servers as observe, you will need to remove the part of the command in the rc.local file that saves information in the /var/log///server_name//.log file (the actual command typed should resemble the last line above). \\  \\ There are text files on the desktop with many of the restart commands. Use these files for quick access to the relevant commands. The commands are edited and can be copied exactly as written. Files include Dome servers and all servers running on ctrscrut. Many of these commands are also located on the [[:chara:restarting_servers|Restarting Servers]] page.
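Paging through /etc/rc.local by hand to find one server's restart commands can be slow. The sketch below prints just the commands that follow a given comment line. It assumes the file is laid out as in the PICO example above, with a "#..." comment followed directly by that server's commands and then a blank line or the next comment. The function name rc_block is hypothetical and not an existing CHARA script.

```shell
#!/bin/sh
# rc_block FILE LABEL: print the lines that follow the first comment line
# containing LABEL, stopping at the next blank line or the next comment.
# Assumption: each server's commands sit directly under a "#..." comment.
rc_block() {
    awk -v label="$2" '
        found && (/^[ \t]*$/ || /^#/) { exit }       # end of this block
        found { print }                               # a command line
        /^#/ && index($0, label) > 0 { found = 1 }    # the matching comment
    ' "$1"
}
```

For example, "rc_block /etc/rc.local 'PICO #1'" would print the rm, tsockman and pico_server lines for review before they are pasted into a terminal. The sketch only prints the commands and does not run them, and any redirection to a /var/log file still has to be trimmed by hand when restarting as observe.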
  
-==== Shutters Server ====
+Shutters Server
  
  \\ The Shutters server can become unresponsive or disconnected from the Socket Manager. This server must be restarted from the lab and not from the Control Room. Follow these instructions to restart it. Note that Shutters runs on ople, not ctrscrut. \\  \\ To start the shutter server on ople: \\  \\ Log into the ople computer and kill the process labeled shutters with the PID as described in **Restarting Servers**  above. \\ Turn off the power to the Shutters with the switch on the computer rack which is to the left of the computer desk and marked "SHUTTERS". Restart the Shutters server with the commands below. After restarting the server and testing the gui to see that it works, turn the SHUTTERS power back on with the switch. There is a printed sheet of directions in the lab to help you. \\  \\ /usr/local/bin/tsockman rm shutters \\ ctrscrut/usr/local/bin/shutter_server /ctrscrut/chara/etc/shutter.cfg &
Line 73: Line 70:
  
 ===== Telescopes and Dome Servers =====
- 
-Here we discuss things that can go wrong with the telescopes. 
  
 ==== The Telescope won't move or stopped moving ====
Line 86: Line 81:
 2. Make sure you understand why the limit was hit which may require a trip to the telescope. If the azimuth positions on all telescope servers and dome guis match, it is likely the limit switch causing the stall and not that the scope is actually in a wrong position.</font><font inherit/inherit;;initial;;white>M</font>ake sure all the scopes’ demand positions agree – for example, sometimes bringing a scope to a configuration that’s already on sky and issuing a slew command will make the additional scope go around North the “wrong” way. \\  \\ <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>3. Click the OVERRIDE ON button in domegui MANUAL tab. After this, the hardware doesn't care about the limits switches and you're free to move the telescope. \\  \\ 2. Make sure you understand why the limit was hit which may require a trip to the telescope. If the azimuth positions on all telescope servers and dome guis match, it is likely the limit switch causing the stall and not that the scope is actually in a wrong position.</font><font inherit/inherit;;initial;;white>M</font>ake sure all the scopes’ demand positions agree – for example, sometimes bringing a scope to a configuration that’s already on sky and issuing a slew command will make the additional scope go around North the “wrong” way. \\  \\ <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>3. Click the OVERRIDE ON button in domegui MANUAL tab. After this, the hardware doesn't care about the limits switches and you're free to move the telescope. \\  \\
 4. Click ENABLE then you can move the telescope back to its normal range of operation. \\  \\
-5. After the telescope is back in its normal range, click OVERRIDE OFF which makes the hardware aware of the limits again and then hit AUTO on the AUTO tab to resume normal operation <font 16px/Calibri,sans-serif;;#333333;;white>**However, if you notice that a telescope will stop near AZ 90 or 270 with the scope still being ENABLED and refusing to move, this is often due to the new AZ limit switch being tripped at some point earlier. To get it moving again:**</font>
+5. After the telescope is back in its normal range, click OVERRIDE OFF which makes the hardware aware of the limits again and then hit AUTO on the AUTO tab to resume normal operation.</font>
+
+<font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>However, if you notice that a telescope will stop near AZ 90 or 270 with the scope still being ENABLED and refusing to move, this is often due to the new AZ limit switch being armed around AZ 0º at some point earlier. To get it moving again:</font>
  
 <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>1. On the domegui MANUAL tab, click STOP so pulses won't be sent to the drive by the control software. \\  \\ <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>1. On the domegui MANUAL tab, click STOP so pulses won't be sent to the drive by the control software. \\  \\
 2.</font><font inherit/inherit;;initial;;white>M</font>ake sure all the scopes’ demand positions agree – for example, sometimes bringing a scope to a configuration that’s already on sky and issuing a slew command will make the additional scope go around North the “wrong” way. \\  \\ <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>3. Click the OVERRIDE ON button in domegui MANUAL tab. After this, the hardware doesn't care about the limits switches and you're free to move the telescope.</font> 2.</font><font inherit/inherit;;initial;;white>M</font>ake sure all the scopes’ demand positions agree – for example, sometimes bringing a scope to a configuration that’s already on sky and issuing a slew command will make the additional scope go around North the “wrong” way. \\  \\ <font 14px/Arial,Helvetica,sans-serif;;inherit;;inherit>3. Click the OVERRIDE ON button in domegui MANUAL tab. After this, the hardware doesn't care about the limits switches and you're free to move the telescope.</font>
- 
-<font inherit/inherit;;initial;;white>2. M</font>ake sure the all the scopes’ demand positions agree – for example, sometimes bringing a scope to a configuration that’s already on sky and issuing a slew command will make the additional scope go around North the “wrong” way. <font inherit/inherit;;initial;;white>3. Click the OVERRIDE ON button in domegui MANUAL tab.</font> 
  
 <font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>4. Move the scope a bit back toward the direction it was coming from – for example, if the scope stopped at AZ 268 while rotating clockwise, move it back to 265 or so using AZ DEC. Then press STOP.</font>
  
-<font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>5. Move the scope past AZ 270 by pressing AZ INC. Usually 2-4 degrees will do.</font>
+<font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>5. Move the scope past AZ 270 by pressing AZ INC. Usually 2-4 degrees will do. Go past the point that shows the limit is triggered.</font>
  
 <font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>6. Click OVERRIDE OFF, go to the AUTO tab and press AUTO, then NEXT (also in the obsgtk; this will restore the original star's demand position to the scope).</font>
  
-<font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>7. When you have time, go to the scope and reset the limit switch; otherwise, it will stop each time you pass AZ 270/90.</font>
+<font 14px/Arial,Helvetica,sans-serif;;#333333;;inherit>7. When you have time, go to the scope and reset the limit switch; otherwise, it will stop each time you pass AZ 270/90. The LED shows red when the scope is on the limit switch and the switch is tripped, i.e. limiting motion of the scope. The LED is yellow if the switch has tripped and the scope is in the caution range, but not on a limit switch. A fine Allen key can be used to push the internal reset button. The LED turns green when the switch is restored.</font>
  
 ==== The Telescope won't track ====
Line 111: Line 106:
 ==== Dome Server Restart ====
  
-To manually start the dome server:
+Dome servers are now started using the bootlaunch_master command. The manual process that has been superseded is archived.

-1. Make sure the power to the drives is OFF.
+To restart the dome server:

-2. Login to the relevant computer as root. For example, type "s1" or "ssh s1" to log on to S1.
+1. Make sure the power to the drives is OFF. Disable the scopes.

-3. Work out the process ID number (PID), either with the command
+2. Log in to the relevant computer as observe. For example, type "s1" to log on to S1.

-(s1:1001) tsockman get dome_S1 \\
+3. Work out the process ID number (PID) by typing bootlaunch_master
-Name : dome_S1 \\
-Machine : s1.chara-array.org \\
-PID : 29953 \\
-Commands : -1 \\
-Data : -1 \\
-Message : 4002 \\
-Restart : /usr/local/bin/dome_server -A33.7441 S1 \\  \\
-or with \\
-(s1:1003) ps aux | grep dome \\
-theo 4473 0.0 0.0 61188 748 pts/3 S+ 10:45 0:00 grep dome \\
-observe 29953 18.5 0.4 35596 9860 ? Sl Apr21 416:11 /usr/local/bin/dome_server -A33.7441 S \\  \\
-It can also be found by pulling up the LIST on SOCKMAN and selecting the relevant dome. \\  \\
-So in this case the PID is 29953. \\  \\
-4. Try and stop the server gracefully: kill -2 29953

-5. You should then check that the server has indeed stopped: \\  \\
+4. Use kill -2 PID to kill the server. Entering the command twice will show "No such process" if it has been killed. If it is not killed, use kill -9.
-[s1:600] tsockman get dome_S1 \\
-Name : dome_S1 \\
-Machine : s1.chara-array.org \\
-PID : 15635 \\
-Commands : -1 \\
-Data : -1 \\
-Message : 2008 \\
-Restart : /usr/local/bin/dome_server -A41.0166 S1 \\  \\
-If the socket manager still thinks it's running you will need to stop it forcefully: kill -9 29953; tsockman rm dome_S1

-6. Restart the dome server by copying the command at the end of the /etc/rc.local file:
+5. Type bootlaunch_master again to restart the server.

-[s1:602] more /etc/rc.local \\
+6. Turn the power to the drives back on.
-<<< Press space bar or enter to scroll through file>>> \\  \\
-#Run the dome server \\  \\
-/usr/local/bin/tsockman remove dome_S1 \\
-/usr/local/bin/dome_server -A41.0166 S1 & \\  \\
-(Note: the part of the command that saves information to /var/log/dome_S1.log has been removed.)

-7. Turn the power to the drives back on.
+7. Hit REOPEN and ENABLE on the domegtk, and type "otcs" in the telescope server.

-8. Hit REOPEN and ENABLE on the domegtk, and type "otcs" in the telescope server.
+You may have to reinitialize the scope on a bright star. If the powers were turned off quickly when the problem was noticed, the position of the scope should be retained and slewing to a bright star will get it in the finder. If not, you may need to go out to the telescope to find the bright star to reacquire the scope's position.
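The "kill -2 first, kill -9 only if needed" step of the restart procedure can be sketched as a small shell function. This is an illustration under stated assumptions and not an existing CHARA script: the name stop_server and the default 5 second wait are hypothetical. Power to the drives should already be off before anything like this is run, as in step 1.

```shell
#!/bin/sh
# stop_server PID [WAIT_SECONDS]: try a graceful stop with SIGINT (kill -2),
# then fall back to SIGKILL (kill -9) if the process is still there.
# Hypothetical helper; the 5 second default wait is an assumption.
stop_server() {
    pid=$1
    wait_s=${2:-5}
    # If the signal cannot be delivered, there is nothing to stop.
    kill -2 "$pid" 2>/dev/null || { echo "No such process"; return 0; }
    i=0
    while [ "$i" -lt "$wait_s" ]; do
        # kill -0 sends no signal; it only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || { echo "stopped with kill -2"; return 0; }
        sleep 1
        i=$((i + 1))
    done
    kill -9 "$pid" 2>/dev/null || true
    echo "stopped with kill -9"
}
```

Calling "stop_server 29953" mirrors typing kill -2 twice and then kill -9 by hand. Once it reports that the process is gone, bootlaunch_master can be run again to relaunch the dome server.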
  
-You may have to reinitialize the scope on a bright star. If the powers were turned off quickly when the problem was noticed, the position of the scope should be retained and slewing to bright star will get it in the finder. If not, you may need to go out to the telescope to find the bright star to reacquire the scopes position.
+==== Telescope is not receiving the commanded position for a target ====

-==== Telescope clock is not correct ====
+Sometimes a telescope receives the wrong position for a target or does not receive the commanded position at all. The commanded position is listed on the telescope server in the first column under TCS Az/El; the second column lists the actual position of the telescope. Try entering the star designation directly into the telescope server, e.g. hd 123456. If it does not accept the number, try closing and restarting the telescope server and hitting reopen on Cosmic Debris and the telescope gui. Try entering the star into the server again. If that does not work, it is possible that something is wrong with the dome server. To restart the dome server, follow the steps above.
- +
-The NTP server has obsoleted this problem and fix as of 2014. +
- +
-==== Telescope is not receiving the commanded position for a target ==== +
- +
-Sometimes it happens that a telescope receives the wrong position for a target or does not receive the commanded position at all. The commanded position is listed on the telescope server in the first column under TCS Az/El; the second column lists the actual position of the telescope. Try entering the star designation directly into the telescope server, ie. hd 123456. If it does not accept the number, try closing and restarting the telescope server and hitting repoen on Cosmic Debris and the telescope gui. Try entering the star into the server again. If that does not work, it is possible that something is wrong with the dome server. To restart the dome server follow these steps: * DISABLE the telescope using either the telescope or dome GUI. +
- +
-  * Turn off the power for the telescope (both AZ/EL). +
-  * First shutdown the telescope server +
-  * Then use SOCKMAN to select the appropriate dome server from the list (dome_E1, etc) Get the PID number for reference. +
-  * The dome server will need to be restarted from the command line in a terminal. +
-  * Open a terminal window and log on to the telescope computer by typing "s1", "s2", "e1", "e2", "w1", or "w2" +
-  * Look for instruction in the /etc/rc.local file by typing "more /etc/rc.local". Scroll through the file by hitting space bar or Enter. +
-  * Locate instruction listed under "#Run the dome server". Copy and paste the commands at the prompt directly on the telescope computer: +
-      * /usr/local/bin/tsockman rm dome_S1 +
-      * /usr/local/bin/dome_server -A33.7441 S1 >& /var/log/dome_S1.log +
-  * (Replace S1 with commands for appropriate telescope) +
-  * After the dome server is running, re-open the telescope server and click [REOPEN} on the dome and telescope GUIs and on Cosmic Debris. +
-  * ENABLE the telescope using the telescope or dome GUI. Check to see if the command and telescope positions are correct. If they are, then turn on the power for the telescope drives. Re-enter the star information by entering the HD number in the telescope server and click [GO NEXT] to send the telescope to the star. Make sure that the telescope is behaving as expected. The telescope might have drifted a bit, so if you can't find the star, it might be necessary to use the Telrad on a bright star to re-initialize the pointing. +
-  * Restart commands for the dome servers are also listed on the desktop in a text file.+
  
 ==== Telescope is tracking poorly, overshooting in slew, oscillating. ====
  
-This might mean that the gain for the tracking servo is wrong. Note that changing this gain can be dangerous, especially if you set it too high as that can cause the telescope to oscillate and damage the drives. Please only do this if you are very very sure that it is necessary. Symptoms of bad gain are: The scope overshoots the position while slewing. The star will be seen to move out of the window and may come back after a few seconds. This means the slewing gain is too low. The scope oscillates when tracking or after a slew. The star will be tracing an ellipse, figure eight or other looping shape. This means the tracking gain is too low. You can damp this out with the telescope or dome gui by disabling the scope, then re-enabling it. Adjust the gain upward and watch it on the next slew. In all cases if either gain is too high the scope will go into "Fog Horn" mode, which is bad. You always want to use the lowest gain that still allows the scope to work as well as possible. If the tiptilt tells you the scope is oscillating slowly, the gain may be too low. If it is oscillating quickly it may be too high. On 10-22-2016, the gain settings were:+This might mean that the gain for the tracking servo is wrong. Note that changing this gain can be dangerous, especially if you set it too high as that can cause the telescope to oscillate and damage the drives. Please only do this if you are very very sure that it is necessary. Symptoms of bad gain are: The scope overshoots the position while slewing. The star will be seen to move out of the window and may come back after a few seconds. This means the slewing gain is too low. The scope oscillates when tracking or after a slew. The star will be tracing an ellipse, figure eight or other looping shape. This means the tracking gain is too low. You can damp this out with the telescope or dome gui by disabling the scope, then re-enabling it. Adjust the gain upward and watch it on the next slew. 
In all cases if either gain is too high the scope will go into "Fog Horn" mode, which is bad. This can be seen during slews on the twfs or labao as vibrating spots, usually in one axis.
  
-|Scope|AZ Slewing| |EL Slewing| |AZ Tracking| |EL Tracking| |Date Updated| +You always want to use the lowest gain that still allows the scope to work as well as possible. If the tiptilt tells you the scope is oscillating slowly, the gain may be too low. If it is oscillating quickly it may be too high.
-| |Gain|Fn|Gain|Fn|Gain|Fn|Gain|Fn| | +
-|S1|7|4|4|3|16|4|10|7|10-26-2017| +
-|S2|7| |7| |10| |10| |10-22-2016| +
-|E1|7| |7| |13| |10| |10-22-2016| +
-|E2|7| |7| |13| |10| |10-22-2016| +
-|W1|13| |10| |13| |10| |10-22-2016| +
-|W2|7| |7| |13| |13| |10-22-2016|+
  
- \\ +The usual values for slewing gains are 4-7 and tracking gains are 7-10. Note that these values may change as the temperatures change rapidly. Gains are usually higher when cold and lower when warm. Be sure to set mode back to AUTO if changing the gains left it in Slewing or Tracking mode.
-Note that these values may change as the temperatures change rapidly. These are still being adjusted so this is by no means final. Be sure to set mode back to AUTO if changing the gains left it in Slewing or Tracking mode.+
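
The rule of thumb above can be written down as a small check. This is only a sketch and not part of the CHARA software: the function name and the returned strings are hypothetical, and the ranges are simply the usual values quoted above (slewing 4-7, tracking 7-10), which shift with temperature.

```python
# Hypothetical helper, not part of the CHARA software.
# The ranges are the usual values quoted in the text; they drift with temperature.
USUAL_GAIN_RANGES = {
    "slewing": (4, 7),
    "tracking": (7, 10),
}


def check_gain(mode, gain):
    """Compare a gain with the usual range for 'slewing' or 'tracking'."""
    low, high = USUAL_GAIN_RANGES[mode]
    if gain < low:
        return "below usual range"
    if gain > high:
        return "above usual range"
    return "within usual range"


if __name__ == "__main__":
    print(check_gain("slewing", 5))
    print(check_gain("tracking", 13))
```

A value outside the usual range is not necessarily wrong, since gains tend to be higher when cold and lower when warm; the check is only a prompt to look at how the scope behaves on the next slew.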
  
 ==== How to Adjust CPUMotor Gains ==== ==== How to Adjust CPUMotor Gains ====
Line 240: Line 179:
  
 Sometimes when observing, the dome will not follow the telescope during a slew. This can happen when the Autodome feature is not turned on. Click the ON button on the MAIN tab of the telescope gui to enable it. This may happen after a server restart so always check the dome position with the spycam during a slew after a server restart. Also make sure the target position of the dome matches the telescope's position. If not, it will insist on being in the wrong place. If it is not at the same AZ as the scope, manually move it until it is centered on the telescope in spycam 1. If the dome AZ does not read the same as the telescope AZ, enter the scope AZ in the position box of the DOME tab of appropriate dome server and hit the INIT POS button to tell it at what AZ it is. \\  \\ Sometimes when observing, the dome will not follow the telescope during a slew. This can happen when the Autodome feature is not turned on. Click the ON button on the MAIN tab of the telescope gui to enable it. This may happen after a server restart so always check the dome position with the spycam during a slew after a server restart. Also make sure the target position of the dome matches the telescope's position. If not, it will insist on being in the wrong place. If it is not at the same AZ as the scope, manually move it until it is centered on the telescope in spycam 1. If the dome AZ does not read the same as the telescope AZ, enter the scope AZ in the position box of the DOME tab of appropriate dome server and hit the INIT POS button to tell it at what AZ it is. \\  \\
-If the dome does not turn at all, even with the manual controls on the telescope or dome guis, the control may be set to manual on the control box. This can happen if there was work done at the dome during the day. If the dome opens, but does not turn, check to see that control of the dome rotation is in the computer position and not manual on the dome rotation controller box just inside the door of the bunker.+If the dome does not turn at all, even with the manual controls on the telescope or dome guis, the control may be set to manual on the control box. This can happen if there was work done at the dome during the day. If the dome opens, but does not turn, check to see that control of the dome rotation is in the computer position and not manual on the dome rotation controller box just inside the door of the bunker. Sometimes the drive wheel jumps in the track and cannot turn the dome, even when the motor works. This will need to be fixed during the day.
  
 ==== Dome does not open ==== ==== Dome does not open ====
Line 253: Line 192:
  
 ===== HUT servers ===== ===== HUT servers =====
- 
-==== I can't change the camera settings on the TV ==== 
- 
-**CHECK THIS SECTION - THESE ARE VERY CLOSE BUT DIFFERENT PARAGRAPHS:** 
  
 The HUT servers control functions such as beacon and dichroic movements, heater and dehumidifier usage, and various AO functions. An observer may find that the obsgtk is no longer controlling the beacon LED's, beacon flat or dichroic alignments. This happens on occasion with E2 and other scopes. The HUT server may be the cause if it has quit or lost connection or the AOB may be at fault. To see if it is the server, open the HUT gui for the desired telescope from the CHARA menu. If the alignments can be changed from the gui, then the HUT server is ok. You can use the hut gui to continue observing. If the hut gui gives move error messages, cycle the power on the AOB and open a new hut server to restore the connection to the obsgtk. On the POWER gui, turn off the power to the AOB for the offending telescope and turn it back on. Stop the hut server by logging into the appropriate telescope computer and identifying the PID with the bootlaunch_master command and killing the process with the kill -9 #### command. Start the new server via the bootlaunch_master command. Hit REOPEN on the obsgtk to reopen the connection to the HUT server and hit reopen on Cosmic Debris as well. The HUT servers control functions such as beacon and dichroic movements, heater and dehumidifier usage, and various AO functions. An observer may find that the obsgtk is no longer controlling the beacon LED's, beacon flat or dichroic alignments. This happens on occasion with E2 and other scopes. The HUT server may be the cause if it has quit or lost connection or the AOB may be at fault. To see if it is the server, open the HUT gui for the desired telescope from the CHARA menu. If the alignments can be changed from the gui, then the HUT server is ok. You can use the hut gui to continue observing. If the hut gui gives move error messages, cycle the power on the AOB and open a new hut server to restore the connection to the obsgtk. 
On the POWER gui, turn off the power to the AOB for the offending telescope and turn it back on. Stop the hut server by logging into the appropriate telescope computer and identifying the PID with the bootlaunch_master command and killing the process with the kill -9 #### command. Start the new server via the bootlaunch_master command. Hit REOPEN on the obsgtk to reopen the connection to the HUT server and hit reopen on Cosmic Debris as well.
- 
-The HUT servers control functions such as finder and acq exposure times and gains, heater and dehumidifier usage, and various AO functions. An observer may find that the camera settings do not display or are not adjustable. The HUT server may be the cause if it has quit. To see if it is the server, open the HUT gui for the desired telescope from the CHARA menu. If the camera settings are displayed and the settings can be changed from the gui, then the HUT server is ok. Restart the telescope server to reopen the connection to the HUT server and hit reopen on Cosmic Debris as well. If the gui is not functioning, the HUT server will need to be restarted. The HUT servers run on the computer at each telescope. Find the PID from Sockman for the server you want to shut down. Open a terminal and log into the correct telescope computer (for example, type "s1" to log on to S1). Use the command "ps aux | grep HUT" to find the process you need to shut down. The number should match the PID from Sockman. Kill the process by typing "kill -9 pid" where pid is the process number and remove the entry from Sockman with the REMOVE button on the gui or by command line. Restart the hut server using the new "bootlaunch_hut" command described in the section on Restarting Servers. 
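
The step of finding the process number with "ps aux | grep HUT" can be sketched as a small parser. This is an illustration only: the function name, the sample process name "HUT_server", and the sample PIDs are all made up, and the sketch only reads text that has the standard "ps aux" column layout. It does not kill anything.

```python
# Illustrative sketch: pull PIDs out of text in the standard `ps aux` layout.
# The process name and PIDs in SAMPLE are invented for the example.

def find_pids(ps_output, pattern="HUT"):
    """Return PIDs of processes whose command contains `pattern`.

    The grep process itself, which also matches the pattern, is skipped.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)        # the 11th field is the full command
        if len(fields) < 11:
            continue
        command = fields[10]
        if pattern in command and "grep" not in command:
            pids.append(int(fields[1]))      # the 2nd column is the PID
    return pids


SAMPLE = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
observe 4321 0.5 1.2 123456 7890 ? Sl 19:02 0:03 HUT_server S1
observe 4400 0.0 0.0 9032 740 pts/1 S+ 19:10 0:00 grep HUT
"""

if __name__ == "__main__":
    print(find_pids(SAMPLE))
```

The number found this way should match the PID shown in Sockman before anything is killed.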
  
 If the server won't restart, a reboot of the power supply in the telescope bunker might be necessary. The power supply that controls the acquisition and finder cameras as well as their controllers is located on top of the computer rack in each bunker. The power supply has green readouts of volts and current. After turning the power off for 10 seconds and back on, try restarting the server from the computer in the bunker to see if it starts cleanly. If so, then restart the telescope server, reopen the connection to the telescope gui, and hit REOPEN on Cosmic Debris. Part of the HUT server also controls the AO table. If the AOB part of the HUT server doesn't work, then the power supply on the back of the AO table in the telescope dome might need recycling. This power supply controls the actuators at M2 and the AO table. The power supply box is a 6×9 inch aluminum box on the back of the AO table, behind the keyboard and monitor. Turn it off with the power button on the bottom edge of the box, wait 5 seconds and turn it back on. The HUT server should now restart cleanly. Restart the telescope server as well to make the connections to the telescope gui. Hit REOPEN in Cosmic Debris if you are observing to make all needed connections. If the server won't restart, a reboot of the power supply in the telescope bunker might be necessary. The power supply that controls the acquisition and finder cameras as well as their controllers is located on top of the computer rack in each bunker. The power supply has green readouts of volts and current. After turning the power off for 10 seconds and back on, try restarting the server from the computer in the bunker to see if it starts cleanly. If so, then restart the telescope server, reopen the connection to the telescope gui, and hit REOPEN on Cosmic Debris. Part of the HUT server also controls the AO table. 
If the AOB part of the HUT server doesn't work, then the power supply on the back of the AO table in the telescope dome might need recycling. This power supply controls the actuators at M2 and the AO table. The power supply box is a 6×9 inch aluminum box on the back of the AO table, behind the keyboard and monitor. Turn it off with the power button on the bottom edge of the box, wait 5 seconds and turn it back on. The HUT server should now restart cleanly. Restart the telescope server as well to make the connections to the telescope gui. Hit REOPEN in Cosmic Debris if you are observing to make all needed connections.
Line 267: Line 200:
  
 [[:chara:old_lab_tiptilt_server|Instructions for the old lab tiptilt system are archived here.]] [[:chara:old_lab_tiptilt_server|Instructions for the old lab tiptilt system are archived here.]]
- 
-The tiptilt server controls the CCD based tiptilt detection system. \\  \\ 
-Before you start the tiptilt server, you must ensure that the power to the cooling system and the CCD itself is on. It is extremely important that the cooler be running before you turn on the CCD and that it is only turned off if you are sure the CCD is NOT running. You can start the server from the X windows menu or with the command xtiptilt. \\  \\ 
-Note that there are background counts and read noise to deal with. Whenever you change the frame rate, please ensure that the bias frame is OK. The server will attempt to load an old bias frame that should work, but if things are not working, try making a new bias frame by ensuring that the detector is in the dark and typing "mkbias" into the tiptilt server. \\  \\ 
-In the tiptilt GUI windows, the white dots represent the starlight while the green dots represent the motion applied to telescope's secondary mirror to keep the starlight centered. When tiptilt is locked the white dots will be brought to the center of the tiptilt window. The green dots should be mostly centered also. W2 and E2 telescopes have a small oscillation that show as back and forth plots of the green dots. 
- 
-==== Tiptilt server complains about the CCD ==== 
- 
-Is the CCD turned on? When the tiptilt server starts up it tries no more than five times to communicate with the CCD. If they all fail, it will give up. If this happens, try cycling the power to the CCD and try again. If this fails, connect to the tiptilt machine and type the command rtccdAPIDemo, which should return with no errors. Try this command a few times, but if it still fails, there is a more serious problem. Turn off the CCD and reboot the tiptilt computer. If it still fails, I am afraid you are in more serious trouble. \\  \\ 
-Note that it is never a good idea to reboot machines unless you are very very sure it is necessary. The only reason to reboot tiptilt, other than a lock up of some kind, is that the clock interrupt has failed. You can test this by running the command "testclock" on a tiptilt command line. If this says the clock is working do not reboot the machine. \\  \\ 
-Also note that cycling the power on the CCD can cause harm so be sure you need to do it before trying it. Also, it is important to wait for at least 20 seconds after turning off the power before turning it on again. 
- 
-==== Tiptilt doesn't seem to be talking to the telescopes ==== 
- 
-Sometimes the telescope server will not show that TT is running. It will show 0Hz for a signal rate for TT. Running TIPTILT COMM will not get it started while other scopes do show it starting. Close and restart any telescope servers that won't connect after two tries of TIPTILT COMM. \\ 
-Note: There is more info in the software manual on this topic, but I wasn't sure if it was still relevant. 
- 
-==== Tiptilt server says the clock isn't running ==== 
- 
-First check whether the clock itself is running and the other machines receive the clock signal. Look at the clock cards at the back of other computers in the rack. The clock cards have three LEDs, one yellow and two greens. If the computer is receiving the clock signal properly all three LEDs should blink, but at a different rate. If the LEDs on all the clock cards are solid then reboot the GPS computer. When the GPS computer is down, it is best to cycle the power also on the box right above the GPS computer. \\  \\ 
-If the clock appears to be working properly on other machines and not on the tiptilt now it is time to reboot tiptilt. \\  \\ 
-[There is a bug in the real time part of the CCD code. It is caused by the clock in the tiptilt system either not running at all or having been set to a time very different from the last time the CCD ran.] \\  \\ 
-For the time being the only solution is to reboot tiptilt, but do so from the lab. Power OFF the CCD, then reboot the tiptilt machine and go into the BIOS. Make sure that interrupt 11 has been set to ISA legacy, save the BIOS and reboot. When the clock card LEDs in the tiptilt machine indicate proper clock signal, turn the CCD back on and start the tiptilt server. \\  \\ 
-Also, sometimes Serial Port 3 grabs IRQ 11 which stops the clock from running. Since there is no serial port 2 it's safe to disable this in the BIOS. This problem normally comes up when there has been a power outage. \\  \\ 
-Sometimes syncing the clock can also cause this problem, but that should be fixed soon. If it does, exit the tiptilt server, log in as root, and reload the tiptilt module using the following commands: \\ 
-/sbin/rmmod tiptilt_rt \\ 
-/sbin/insmod /usr/local/modules/tiptilt_rt.o \\  \\ 
-Note that it is never a good idea to reboot machines unless you are very very sure it is necessary. The only reason to reboot tiptilt, other than a lock up of some kind, is that the clock interrupt has failed. You can test this by running the command "testclock" on a tiptilt command line. If this says the clock is working do not reboot the machine. 
- 
-==== Tiptilt is not locking on a star or locks, but lets the star drift away ==== 
- 
-  * Are all the mirror covers open? [Note: W1 M7 cover sometimes needs two clicks to open, even when Scope-monitor indicates it is open.] 
-  * Has TIPTILT COMM been run from Cosmic Debris? 
-  * Check the ACQ alignment to make sure the tick marks are centered on the laser. 
-  * If there is plenty of starlight getting into tiptilt, then try re-initializing by clicking [TIPTILT COMM] on Cosmic Debris. 
-  * If there are a large number of background counts on TT, then close the M5 cover on the telescope, and click [DBIAS] on the TIPTILT GUI to clear the background counts. Re-open the M5 cover. 
-  * If a star drifts even with TT locked, there could be a bright sky or light from other beams getting into the affected telescope's beam. This will show in the TT server as much lower counts than the other locked stars, but still high enough to lock TT. Use the laser to find the correct position to lock TT. 
-  * If TT unlocks and the star drifts, the TT servo may not have engaged. This is engaged with the TIPTILT button on the telescope gui. Clicking it may not start the servo the first time. Try it again if the green dots drift on the TT plot windows or if the Servo status in the telescope gui reads None instead of Wobb 1. 
-  * Are the TIPTILT buttons turned on from the POWER GUI? 
- 
-==== Tiptilt servo oscillates ==== 
- 
-You will see the oscillation in the green dots of the tiptilt GUI windows. Sometimes you can also see the oscillation in the white starlight dots or as an elongation of the star when looking at the ACQ field while tiptilt is locked. Some scopes have an oscillation that has not yet been diagnosed. W2 is one that usually oscillates. A diagonal motion in the tiptilt box indicates an oscillation in one axis only, while a vertical/horizontal motion indicates an oscillation in both directions. Motion from the upper right to lower left corresponds to elevation axis while motion from the upper left to lower right corresponds to the azimuth axis. (I think you can check direction by typing sin into telescope server to send sine waves to the telescope.) There are a few ways you can try to correct the oscillation by manually tuning the servo: 
- 
-  * Type "tune" into the tiptilt server. Select the appropriate telescope. Turning the gain down normally helps. 
-  * Type "tune" into the appropriate telescope server. The default value for the proportional term is -0.5 and differential term is 0.0. Adjust these values between -0.2 to -1.0 for proportional and 0.0 to 0.2 for differential. 
-  * Read section on adjusting the Telescope Tracking Gain using the Dome Server GUI (ref). 
- 
-==== Tiptilt is saturating ==== 
- 
-  * Tiptilt saturates at ~ 200,000 counts. If you are near this limit, you can reduce the TT exposure time to lower the number of counts. 
-  * Set the TT exposure time on Cosmic Debris in the box for "Tiptilt (mS)". This will set the TT exposure time when slewing to a new target. 
-  * To change the exposure when already at a target, then click the [EXP] button on the TipTilt GUI. This will bring up a dialog box where you can enter the new exposure time in msec. Check to make sure the tiptilt frequency (in Hz) changes on the tiptilt server after changing the exposure time. If the frame rate doesn't change, then set the exposure time back to the old value and try entering the new value again. You might have to do this a few times to actually get the frame rate to change. 
-  * Frame rates for given exposure times: 
-      * ExpTime = 5 msec, Frame Rate = 157 Hz 
-      * ExpTime = 2 msec, Frame Rate = 299 Hz 
-      * ExpTime = 1 msec, Frame Rate = 427 Hz 
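
As a worked example of the saturation advice above, the sketch below picks the longest listed exposure time that is predicted to stay under the saturation level. It is not part of the tiptilt software: the function name is hypothetical, and it assumes that counts scale linearly with exposure time, which is only an approximation. The saturation level and frame rates are copied from the list above.

```python
# Values copied from the list above; the rest is an illustrative assumption.
SATURATION_COUNTS = 200_000                # approximate tiptilt saturation level
FRAME_RATE_HZ = {5: 157, 2: 299, 1: 427}   # exposure time (msec) -> frame rate (Hz)


def pick_exposure(current_exp_ms, current_counts, limit=SATURATION_COUNTS):
    """Return the longest listed exposure (msec) predicted to stay under `limit`.

    Assumes counts scale linearly with exposure time. Returns None if even the
    shortest listed exposure is predicted to saturate.
    """
    for exp_ms in sorted(FRAME_RATE_HZ, reverse=True):
        predicted = current_counts * exp_ms / current_exp_ms
        if predicted < limit:
            return exp_ms
    return None


if __name__ == "__main__":
    exp = pick_exposure(5, 250_000)
    print(exp, FRAME_RATE_HZ[exp])
```

After changing the exposure time, the frame rate reported by the tiptilt server should still be checked, since the new value does not always take effect on the first try.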
- 
-==== Tiptilt counts are way too low ==== 
- 
-Try using a slower frame rate or increasing the NSUM. Also ensure that the acquisition is properly aligned with the laser. To change the frame rate, click the [EXP] button on the tiptilt GUI and enter a longer integration time. Remember to change "Tiptilt (mS)" on Cosmic Debris to keep the same exposure time when slewing to the next target. 
- 
-==== Tiptilt counts are negative ==== 
- 
-The bias frame is bad. Get a new one or turn it off. 
  
 ===== OPLE and Metrology ===== ===== OPLE and Metrology =====
Line 350: Line 218:
  
   * Good metrology signals are important to the proper positioning of the carts. Monitor the signal strength by running [RUN MULTIPLE] on the Metrology Monitor. Place the windows above the TV windows for each scope you are using. They should show white sine waves that are around the height of the window. Erratic, fluctuating waves indicate self-interference or a weak signal. This may cause the carts to lose their place as the signal strength falls too low. Red waves indicate that some displayed signal has gone too low and the carts will all need to be homed. A careful adjustment of the MET2 mirror can sometimes bring the signal back. Do not adjust the MET1 mirror.   * Good metrology signals are important to the proper positioning of the carts. Monitor the signal strength by running [RUN MULTIPLE] on the Metrology Monitor. Place the windows above the TV windows for each scope you are using. They should show white sine waves that are around the height of the window. Erratic, fluctuating waves indicate self-interference or a weak signal. This may cause the carts to lose their place as the signal strength falls too low. Red waves indicate that some displayed signal has gone too low and the carts will all need to be homed. A careful adjustment of the MET2 mirror can sometimes bring the signal back. Do not adjust the MET1 mirror.
-  * To home the carts, turn off the [OL] and [MAN] buttons on each cart and it will automatically return to the front switch. If the cart has no issue, it will arrive at the target position of 0m and the home switch at the same time. The X in the OPLE server under the HM will indicate it has homed to the home switch. If a cart does not reach the home switch when it returns to position 0m, it was lost and likely the cause of the difficulty in finding fringes. You can also find the size of the error by typing the command "homechk S1" into the ople server. Hit the [HOME] button and it should move forward and trigger the home switch. Hit [TRACK] to home it to the new home position.+  * To home the carts, turn off the [OL] and [MAN] buttons on each cart and it will automatically return to the front switch. If the cart has no issue, it will arrive at the target position of 0m and the home switch at the same time. The X in the OPLE server under the HM will indicate it has stopped on the home switch. If a cart does not reach the home switch when it returns to position 0m, it was lost and likely the cause of the difficulty in finding fringes. The X does not guarantee the cart has retained its home position. This home position can be checked by using the CHECK button on the ople gui or by typing the command "homechk S1" into the ople server. When the error value is displayed on the ople server, hit ESC to clear the display. Click on the [OL] and [MAN] buttons and hit [TRACK] to send the cart to the desired cart position.
  
 ===== Beam Samplers ===== ===== Beam Samplers =====
Line 427: Line 295:
  
   * If the top of a server goes blank, try typing "sb" (start background) on the server command line.   * If the top of a server goes blank, try typing "sb" (start background) on the server command line.
-  * If the server screen fills with jibberish, try hitting CTRL-in the server to clear it.+  * If the server screen fills with jibberish, try hitting CTRL-in the server to clear it.
  
 ==== Server is frozen ==== ==== Server is frozen ====
Line 525: Line 393:
  
  \\  \\
-Last updated 2017-04-12+Last updated 2021-01-18 by Norm Vargas
  
  
chara/trouble_shooting.txt · Last modified: 2024/06/18 00:21 by charaobs