Td Mxc Smf Roush
8/14/2019 Td Mxc Smf Roush
Let SMF Deal With That:
An introduction to the Service Management Framework
Ellard Roush, Sun Microsystems
-
Agenda
Introduction to Service Management Framework (SMF)
Commands
Demo
Service Manifest and Development
Q&A
-
Service Management pre-SMF
Daemons started by scripts delivered into /etc/rc*.d, or by inetd (through /etc/inetd.conf)
> Dependencies expressed through script numbering (fragile, imprecise)
Common operations like stopping a service "now & forever" required two different steps
> Easy to forget one
> Often undone by patching or upgrade anyway
Daemon death ignored after start
OS didn't know the consequences of memory errors in daemons, so it had to panic
-
What is SMF?
It is the solution to all the problems on the last slide
It is half of Predictive Self Healing
> It works with the Solaris Fault Manager to gracefully recover from uncorrectable hardware errors
It provides public, documented interfaces that ISVs and customers can use
It is used automatically
> No need to turn it on
> No way to turn it off
-
SMF basics: svc.startd
A new system daemon, svc.startd, has taken over most of init's responsibilities in starting system services
init still uses inittab, and /etc/rc*.d scripts are still run
svc.startd can automatically restart services
> If sshd is enabled, then it is started at boot and restarted if it dies (even if killed)
> sshd may be disabled by a single command; it is then stopped, not started at boot, and not started after patch or upgrade
-
Service states
SMF lets the admin set whether each service is enabled or disabled
SMF keeps a state for each service
> uninitialized: has not been evaluated yet
> disabled: service is disabled, not running
> offline: enabled, waiting for dependencies
> online: enabled and running
> degraded: running below full performance
> maintenance: service problem occurred
-
Service dependencies
Services may declare dependencies on each other
svc.startd starts services in dependency order
> Independent services are started in parallel, giving a faster boot
Uncorrectable hardware errors handled better
> Daemon is restarted
> Services which depend on it can be restarted
Enabled services hang out in the offline state until their dependencies are met
> A new command answers "What services is service X waiting for?"
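To make "dependency order" concrete, here is a minimal sketch that uses the standard tsort utility as a stand-in for the ordering step. It is only an illustration: svc.startd does not use tsort, and it also starts independent services in parallel and honors dependency groupings, which this sketch ignores. The pairs are taken from the svcs -d sendmail and svcprop apache2 examples in this talk; the function name start_order is made up.

```shell
# Each input line to tsort is "<dependency> <dependent>": the left-hand
# service must be online before the right-hand one may start.
start_order() {
    tsort <<'EOF'
svc:/system/identity:domain svc:/network/smtp:sendmail
svc:/network/service:default svc:/network/smtp:sendmail
svc:/network/physical:default svc:/network/http:apache2
EOF
}

# Print one valid start order, one FMRI per line.
start_order
```

Any order that tsort prints places each dependency before its dependents; services with no ordering between them (sendmail and apache2 here) may appear in either order, which is exactly the freedom that allows a parallel start.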
-
SMF configuration
Service meta-configuration (enabled status, state, dependencies, methods, etc.) is kept in the Service Configuration Facility (SCF), also known as the SMF repository
The repository is controlled by svc.configd, another new daemon
The repository is (currently) stored in /etc/svc/repository.db
[Diagram: svc.startd, svc.configd, repository.db, SMF tools]
-
Service names: FMRIs
Services are named by Fault Management Resource Identifiers, or FMRIs
> URI syntax
svc:/system/cron:default
(service name: system/cron; instance name: default)
Note that while the service name usually contains slashes, there are no service directories! The namespace is flat.
Commands accept abbreviations (system/cron, cron) and glob patterns
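The service-name and instance-name parts of an FMRI can be pulled apart with plain parameter expansion. The sketch below assumes a full instance FMRI of the form svc:/service:instance; it does not handle the abbreviations or glob patterns that the SMF commands accept. The function names are made up, and a POSIX shell is assumed (on Solaris 10, ksh or /usr/xpg4/bin/sh rather than the legacy /bin/sh).

```shell
# Print the service name of a full instance FMRI.
fmri_service() {
    rest=${1#svc:/}              # drop the "svc:/" scheme prefix
    printf '%s\n' "${rest%:*}"   # drop the trailing ":instance"
}

# Print the instance name of a full instance FMRI.
fmri_instance() {
    printf '%s\n' "${1##*:}"     # keep only the text after the last colon
}

fmri_service svc:/system/cron:default    # prints system/cron
fmri_instance svc:/system/cron:default   # prints default
```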
-
Service instances
To allow configuration sharing, services are represented as instance nodes which are children of service nodes
Both service nodes and instance nodes can have properties
If an instance doesn't have property X, the service's property X is used
Dependencies on a service are satisfied if any of its instances are online
> Frees dependents from knowing implementation
[Diagram: repository containing service nodes, each with properties and child instance nodes that have their own properties]
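The fallback rule for properties can be sketched with a toy lookup. Everything here is hypothetical: the "repository" is a few lines of text, the service svc:/site/example and its properties are invented, and the real repository is reached through svc.configd rather than read like this. A POSIX shell is assumed.

```shell
# Toy repository: one property per line, as "<node> <property> <value>".
toy_repository() {
    cat <<'EOF'
svc:/site/example start/timeout_seconds 60
svc:/site/example config/port 8080
svc:/site/example:alt config/port 8081
EOF
}

# Print the value of property $2 set directly on node $1, if any.
node_prop() {
    toy_repository | while read -r node prop value; do
        if [ "$node" = "$1" ] && [ "$prop" = "$2" ]; then
            printf '%s\n' "$value"
        fi
    done
}

# Look up property $2 on instance $1, falling back to the parent service.
lookup_prop() {
    value=$(node_prop "$1" "$2")
    if [ -z "$value" ]; then
        value=$(node_prop "${1%:*}" "$2")   # strip ":instance" to reach the service node
    fi
    printf '%s\n' "$value"
}

lookup_prop svc:/site/example:alt config/port       # prints 8081 (set on the instance)
lookup_prop svc:/site/example:default config/port   # prints 8080 (inherited from the service)
```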
-
Commands: svcs(1)
$ svcs
STATE          STIME    FMRI
....
online         18:18:30 svc:/network/http:apache2
online         18:18:29 svc:/network/smtp:sendmail
....
$ svcs -p sendmail
STATE          STIME    FMRI
online         18:18:29 svc:/network/smtp:sendmail
               18:18:29   100180 sendmail
               18:18:29   100181 sendmail
$ svcs -d sendmail
STATE          STIME    FMRI
online         18:17:44 svc:/system/identity:domain
online         18:17:52 svc:/network/service:default
....
Without arguments, lists state, state-time, and FMRI of services that are enabled; with -a, lists all services
Show dependencies (-d) and dependents (-D)
Show member processes (-p), additional details (-v/-l)
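Because svcs prints one service per line with the state in the first column, its output is easy to filter in a script. The sketch below reads a plain svcs listing (not the -p form) on standard input and prints the services in the offline, maintenance, or degraded states. The function name needs_attention is made up, and a POSIX shell is assumed.

```shell
# Read "STATE STIME FMRI" lines on stdin and print "<state> <fmri>" for
# every service that is enabled but not simply online.
needs_attention() {
    while read -r state stime fmri; do
        case $state in
            offline|maintenance|degraded)
                printf '%s %s\n' "$state" "$fmri"
                ;;
        esac
    done
}
```

On a system with SMF this would typically be used as `svcs -a | needs_attention`; the header line is skipped because STATE matches none of the three states.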
-
Commands: svcs -x
$ svcs telnet
STATE          STIME    FMRI
offline         7:38:17 svc:/network/telnet:default
$ svcs -x
svc:/network/inetd:default (inetd)
 State: disabled since Wed Jan 25 07:38:17 2006
Reason: Disabled by an administrator.
   See: http://sun.com/msg/SMF-8000-05
   See: inetd(1M)
   See: /var/svc/log/network-inetd:default.log
Impact: 17 dependent services are not running.  (Use -v for list.)
Answers the question: "What's wrong with my system?"
Explains why services are offline, impact of non-running services
Gives pointers to knowledge documents, log files to help you determine the cause and find a remedy
-
Commands: svcadm(1M)
svcadm manipulates services
> svcadm enable enables services; services start when dependencies are ready
> svcadm disable disables services
> svcadm restart stops and starts services
> svcadm refresh commits the current properties (to the running snapshot) and instructs the service to re-read its configuration
> svcadm clear signals that a service in maintenance has been fixed
These commands are asynchronous: they issue commands to svc.startd and return immediately
With -s, enable & disable wait until completion (synchronous)
With -t, enable & disable are temporary (until next boot)
-
Commands: svccfg(1M)
Interactive access to properties and snapshots
# svccfg
svc:> select network/http:apache2
svc:/network/http:apache2> listprop
...
general                  framework
general/enabled          boolean  false
...
start                    method
start/exec               astring  "/lib/svc/method/http-apache2 start"
start/timeout_seconds    count    60
start/type               astring  method
svc:/network/http:apache2> editprop
[ $EDITOR launches on a temporary file containing property settings ]
svc:/network/http:apache2> exit
# svcadm refresh apache2    # read latest configuration
# svcadm restart apache2    # restart with latest configuration
-
Commands Demo
-
Troubleshooting
Service failures printed to console, syslog
Start with svcs -x output
> Often gives concise reason
> Provides link to knowledge document at sun.com
> Gives path to log file
Use svcadm clear to clear maintenance state from repaired services
Use svccfg to tweak debugging variables:
> svccfg -s system/foo setenv LD_PRELOAD libumem.so
> svccfg -s system/foo setenv UMEM_DEBUG default
-
Recovery
If a single service is broken, make sure you've got the latest service config: svcadm refresh
Follow instructions from svcs -x pointer
Revert to a previous snapshot.
$ svccfg -s system/cron:default
svc:/system/cron:default> listsnap
initial
last-import
previous
running
start
svc:/system/cron:default> revert start
svc:/system/cron:default> exit
$ svcadm refresh cron
$ svcadm restart cron
-
Delegated Restarters
svc.startd's model isn't right for all services (inetd, clustering)
SMF allows a service to be a delegated restarter for other services
> Start, stop, and refresh services however they want
> Responsible for managing instance states
> svc.startd still handles enabledness & dependencies, though
inetd was reimplemented as a delegated restarter
> Methods are called inetd_start, inetd_stop, etc.
> Services come online when inetd starts listening for them
> The repository is used for configuration instead of inetd.conf
A public delegated restarter API is planned
-
/etc/inetd.conf & inetadm(1M)
inetd.conf is no longer the primary configuration
Most Solaris inet services have been converted
Entries in inetd.conf are automatically converted during install & upgrade by inetconv(1M)
If something adds an entry to /etc/inetd.conf, inetd(1M) will detect and issue a warning message
> Run inetconv again to convert the new entry
inetadm(1M) can be used to modify inetd-specific properties
-
Service Development: Benefits
Services appear with SMF FMRIs
> Visible using standard Solaris tools; your service appears in administrative heads-up displays
> Manageable using standard Solaris tools; admin can leverage existing knowledge to use your service
> New generic tools developed will automatically see your service
Built-in restart due to administrative error, software, or hardware fault
Participation in future software diagnosis capabilities
-
Service Development: Tasks
An existing Solaris service may be converted incrementally, and to different levels
> Get it working: write a manifest using existing init script as start/stop method
> Handle error cases: refine methods
> Full restartability: if service has multiple components, split them into individual services
> Customized error/restart handling: avoid service restart if fault can be handled internally
-
Service manifests
A service is delivered by an XML file called a manifest
> Describes dependencies, methods, and properties
Manifests are delivered into /var/svc/manifest
During startup, new manifests in /var/svc/manifest and old manifests which have changed are loaded into the repository with the svccfg(1M) command
Do not edit manifests in /var/svc/manifest; make customizations with svccfg(1M), etc.
> Repository customizations will be preserved across patch & upgrade
-
Manifest Creation
Name your service
Identify whether your service may have multiple instances
Identify how your service is started/stopped
Determine faults to be ignored, if any
Identify dependencies
Identify dependents
Create at least one instance
Create template information to describe your service
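The checklist above maps fairly directly onto manifest elements. The sketch below writes a small manifest for a hypothetical service, site/example, to the file named by its first argument. The service name, method path (/opt/example/lib/svc-example), dependency choices, and the function name are all invented for illustration, and the element names and ordering are written from memory of service_bundle.dtd.1, so the result should be checked against the DTD before use.

```shell
# Write a sketch manifest for the hypothetical service site/example to $1.
write_example_manifest() {
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='example'>
<service name='site/example' type='service' version='1'>
    <!-- Create at least one instance (delivered disabled here) -->
    <create_default_instance enabled='false' />
    <!-- This service is not meant to have multiple instances -->
    <single_instance />
    <!-- Dependency: wait for the network milestone -->
    <dependency name='network' grouping='require_all'
        restart_on='error' type='service'>
        <service_fmri value='svc:/milestone/network:default' />
    </dependency>
    <!-- Dependent (optional): make multi-user wait for this service -->
    <dependent name='example_multi-user' grouping='optional_all'
        restart_on='none'>
        <service_fmri value='svc:/milestone/multi-user' />
    </dependent>
    <!-- How the service is started and stopped -->
    <exec_method type='method' name='start'
        exec='/opt/example/lib/svc-example start' timeout_seconds='60' />
    <exec_method type='method' name='stop'
        exec=':kill' timeout_seconds='60' />
    <!-- Faults to ignore: core dumps and external signals -->
    <property_group name='startd' type='framework'>
        <propval name='ignore_error' type='astring' value='core,signal' />
    </property_group>
    <stability value='Unstable' />
    <!-- Template information describing the service -->
    <template>
        <common_name>
            <loctext xml:lang='C'>example daemon</loctext>
        </common_name>
    </template>
</service>
</service_bundle>
EOF
}
```

On a Solaris system the generated file could then be checked with `svccfg validate` before being delivered into /var/svc/manifest/site.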
-
Example Manifest: utmpd(1M)
[Manifest listing not preserved in this transcript; surviving text: "utmpx monitoring"]
-
Method refinement
On failure, explain the problem to stdout or stderr (goes to a log) and exit with a non-0 code
> If the failure is not transient, return $SMF_EXIT_ERR_FATAL or $SMF_EXIT_ERR_CONFIG from /lib/svc/share/smf_include.sh
On success, don't return until service is ready to serve clients
> Dependent services may be started immediately
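A start method might apply these rules with a pre-start check like the sketch below. The function name, messages, and the two conditions checked are hypothetical. When /lib/svc/share/smf_include.sh is not present, the sketch falls back to the exit values that file is understood to define (0, 95, 96), which is an assumption made so the fragment can be read and run off-Solaris.

```shell
# Use the real definitions where available; otherwise assume the usual values.
if [ -r /lib/svc/share/smf_include.sh ]; then
    . /lib/svc/share/smf_include.sh
else
    SMF_EXIT_OK=0
    SMF_EXIT_ERR_FATAL=95
    SMF_EXIT_ERR_CONFIG=96
fi

# Hypothetical pre-start checks: $1 is the configuration file, $2 the daemon
# binary. Messages go to stderr, which SMF routes to the service's log file.
start_checks() {
    if [ ! -f "$1" ]; then
        echo "example: configuration file $1 not found" >&2
        return "$SMF_EXIT_ERR_CONFIG"   # not transient: needs an admin to fix
    fi
    if [ ! -x "$2" ]; then
        echo "example: daemon binary $2 is missing or not executable" >&2
        return "$SMF_EXIT_ERR_FATAL"    # not transient: restarting will not help
    fi
    return "$SMF_EXIT_OK"
}
```

A real method would call something like `start_checks "$conf" "$daemon" || exit $?`, then launch the daemon and return only once it is ready to serve clients.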
-
Commands: svcprop(1)
List properties of services and instances
Fetch individual properties for use in scripts
$ svcprop network/http:apache2
...
physical/entities fmri svc:/network/physical:default
physical/grouping astring optional_all
physical/restart_on astring error
physical/type astring service
start/exec astring /lib/svc/method/http-apache2\ start
start/timeout_seconds count 60
start/type astring method
stop/exec astring /lib/svc/method/http-apache2\ stop
stop/timeout_seconds count 60
stop/type astring method
restarter/auxiliary_state astring none
restarter/next_state astring none
restarter/state astring disabled
restarter/state_timestamp time 1102030556.737590000
$ svcprop -p general/enabled network/http:apache2
false
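For script use, `svcprop -p` on a single property is the direct route. When a full listing is already at hand, the value of one property can also be picked out of the name, type, value lines, as in this minimal sketch. The function name prop_value is made up, and a POSIX shell is assumed.

```shell
# Read svcprop-style "name type value" lines on stdin and print the value of
# the property named by $1. Returns 1 if the property is not listed.
prop_value() {
    while read -r name type value; do
        if [ "$name" = "$1" ]; then
            printf '%s\n' "$value"   # value keeps svcprop's escaping, e.g. "\ "
            return 0
        fi
    done
    return 1
}
```

On a system with SMF this would be used as `svcprop network/http:apache2 | prop_value start/timeout_seconds`.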
-
Development: Other Examples
Manifest DTD is documented; read it at /usr/share/lib/xml/dtd/service_bundle.dtd.1
Explore /var/svc/manifest for similar services
> system/utmp is a simple standalone daemon
> system/coreadm is a simple configuration service
> network/telnet is an inetd-managed daemon
Explore /lib/svc/method for similar methods
-
Service Packaging
Use i.manifest and r.manifest from /usr/sadm/install/scripts
> (from S10U2 or OpenSolaris)
Manifests delivered into /var/svc/manifest with type f and class manifest
> Use /var/svc/manifest/site if the service is specific to your site
> Use another directory if you're an ISV, but remember a uniquifier (e.g. stock ticker)
Methods delivered with your application binaries (/opt strongly recommended)
-
Developer References
Manifest development
> /usr/share/lib/xml/dtd/service_bundle.dtd.1
> Look in /var/svc/manifest for examples
> inetconv -i file to create an empty inetd manifest
> smf_method(5): information for writing methods
> inetd(1M): inetd-specific method information
-
Additional Resources
Discussion and further information at http://opensolaris.org/os/community/smf
Additional quickstart and developer documentation available at http://www.sun.com/bigadmin/content/selfheal/
Solaris System Administration Guide has SMF information: http://docs.sun.com/app/docs/doc/817-1985
smf(5) manpage introduces the facility
Blogs:
> http://blogs.sun.com/sch
> http://blogs.sun.com/lianep